Monthly Shaarli
March, 2021
Scrapes google translate and presents a clean, alternative frontend for it.
(Similar to what e.g. newpipe does for youtube)
An archive of Word versions, from Windows through Mac and MS-DOS to Unix versions. Very very interesting.
@danielhglus Does this count? https://regexone.com/ is a list of lessons where you go through and defeat challenges by making regex pass test cases. It starts really simple and goes pretty far. I don't think it does regex-but-not-regex stuff like negative lookahead and stuff though.
Yesterday was a great day for the vim universe, especially if you write as many LaTeX documents as I do: a new version of vimTeX was released.
This gave me a reason to dive into its documentation once again and so I found a feature that I didn't even know existed: a grammar check in LaTeX.
How it works
To use this, you will need to download the following software:
Language Tool
For some years, there has been an amazing free software project to check natural languages: Language Tool (LT).
However, this works on plain text files and can't filter out the LaTeX stuff.
YaLafi
Enter Yet Another LaTeX filter (YaLafi).
This tool extracts the plain text and performs the grammar check using LT, but it keeps track of the original position of the text!
VimTeX integration
YaLafi provides a vim compiler called vlty (I don't know where this name comes from) to use these tools and VimTeX makes configuration ridiculously easy: see :h vimtex-grammar-vlty.
Once everything is installed and you linked vimTeX to the directory containing the LT .jar file, you can populate and open the quickfixlist as follows:
:compiler vlty | make | copen
(It takes a few seconds.)
Issues
Now, obviously I have only been trying this out for a day, but so far I'm impressed, to say the least.
However, in my opinion the YaLafi documentation is not always clear/easy to get started with:
- how to handle multi-file documents
- how to suppress certain warnings
- how to handle user-defined macros
- ...
Getting to know both LT and YaLafi better will probably solve my problems.
Conclusion
Awesome software:
- vim
- vimTeX
- YaLafi
- Language Tool
I hope some of you can integrate this into your workflow and maybe this post inspires you to make some PR to YaLafi or to write tutorials for less technical users like me to use the advanced options.
A colleague was saying the other day that deep learning models are like sports cars: they need a minimum distance to accelerate before they can reach their top speed. In the same way, deep learning models don't perform well on smaller data sets where there is no room (i.e. not enough data) to rev their engine. That's why a mountain bike (e.g. a CART decision tree) can navigate a trail in a forest better than a Ferrari (e.g. a convolutional neural network).
I really liked my colleague's analogy, but is there any math theory to support what they are saying? Are complex models (e.g. neural networks, SVM) naturally (through their mathematical architecture) more susceptible to overfitting than a logistic regression or decision tree when exposed to smaller data? I feel there is an unspoken rule: "in general, use complicated models on complicated data". But is there any mathematical justification to support this?
I understand that sometimes deep learning models perform poorly because the analyst might not know how to use them properly (e.g. hyperparameter tuning) - but this doesn't reflect the model itself.
I know there is a theorem called the "no free lunch theorem" that shows that, by default, "there is no single best algorithm for all problems" - but can this theorem be used to somehow justify that smaller datasets don't require complex models? I.e. is there some way to show that more complex models (e.g. suppose we quantify model complexity through the VC dimension) don't necessarily produce lower generalization error on smaller datasets?
So, given a very powerful computer that can simultaneously consider millions of hyperparameter combinations: can it be statistically shown that more complex models are not necessarily better for smaller data sets (e.g. iris data)?
Thanks
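As a rough illustration of the question above (not a proof, and not part of the original post), here is a minimal standard-library sketch. It assumes a toy setup that is entirely made up for this example: noisy linear data, only 15 training points, a two-parameter straight line as the "simple" model and a 1-nearest-neighbour predictor, which memorizes the training set, as the "flexible" one. All function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (the 'simple' model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def fit_1nn(xs, ys):
    """1-nearest-neighbour regression: memorizes every training point."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sample(n, rng):
    """Toy data (assumption): y = 2x plus Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.5) for x in xs]
    return xs, ys


rng = random.Random(0)
train = sample(15, rng)     # deliberately tiny training set
test = sample(1000, rng)    # large held-out set to estimate generalization

line = fit_line(*train)
knn = fit_1nn(*train)

for name, model in (("line", line), ("1-NN", knn)):
    print(f"{name:>5}: train MSE = {mse(model, *train):.3f}, "
          f"test MSE = {mse(model, *test):.3f}")
```

The 1-NN model always reaches zero training error because it reproduces the noise exactly, so its training error says nothing about how it generalizes; on data like this the held-out error of the line is typically the lower of the two. This is only one toy case and does not settle the general question.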
Tag your time, get the insight. traggo alternative, pretty similar but based on #tags instead of key:value pairs.
More mature interface and reporting functionality, less extensive dashboarding possibilities.
Simple and elegant progressive vim tutorial - starts out with very basic information and then becomes progressively more 'difficult', ending up with some basic macros and block visual selection tips.
Should be a good one for introducing new people to vim.
GitHub CI/CD-based workflow for static website blog comments
Creating super simple GUI image viewing (with editing and layering possibilities) with just a few lines of python.
Uses qt as its display library (the only one right now afaik?)
Ok this is slightly insane. OpenType (the common font format) actually supports simple scripts inside the font for complex characters and such. So a guy WROTE AN ENTIRE GAME INSIDE A FONT called Fontemon that you play by typing letters on the keyboard. https://www.coderelay.io/fontemon.html#player
https://twitter.com/d_feldman/status/1372778427955044356
Some explanation and demonstration of the power of LuaTeX (or LuaLaTeX), from 2017.
An absolutely amazing guide and introduction into NodeRED programming (and the idea of flow-based and IoT programming in general).
It is sorted by examples and the chapters go deep into creating useful flows. Really, really good.
Somewhat similar to traefik, somewhat of a successor for many.
Is generally easier to configure, and carries more 'batteries-included' stuff with it.
Should work just as fine for docker containers and swarms, and can be integrated with plugins for all kinds of things (e.g. consul for ssl certs)
Python library to manipulate PDF page labels.
Meaning: you can change the labelling of page sections (preface getting roman numerals and different numbering scheme than main section, etc)
Should be really useful for later on programmatically going to the correct pages and extracting the correct page citations from annotations etc.
Executes alongside nextcloud to run arbitrary shell scripts as cronjobs
Pandoc preprocessor/wrapper to consume, display, merge and diff criticmarkup (i.e. track changes mode).
Simple date-time parser, with nice unix-y philosophy.
Demo shows how it can be integrated into e.g. vim workflow for switching between different representations of dates,
and moving dates up, creating diaries with hundreds of entries, etc.
Funny little website which, while taking itself not too seriously, contains some good advice.
7 principles of getting stuff done - following something similar to the pareto principle, deep-work ideas, and so on.
Using pipe2 functionality, could presumably be adapted for newsboat, aerc, etc.
Dropbear does not support ED25519 keys. It will simply ignore them in the keyfile.
Python bindings for bibtex.
Can be used either as a cmdline replacement for bibtex, or, more practically, as a python library to parse and interact with bib files.
Careful, however, since some mention that it rewrites, and sometimes messes up? Still have to investigate before letting it loose on my actual bibtex libraries.
# If Python version returned above is 3.X
python3 -m http.server
# On windows try "python" instead of "python3", or "py -3"
# If Python version returned above is 2.X
python -m SimpleHTTPServer
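The same quick file server can also be started from inside a Python 3 script. The sketch below is one way to do it with the standard library's http.server module (Python 3.7 or newer); the helper names serve_directory and fetch are made up for this example, and binding to 127.0.0.1 with an OS-chosen port is an assumption, not what the one-liner above does.

```python
import http.client
import http.server
import tempfile
import threading
from functools import partial
from pathlib import Path


def serve_directory(directory=".", port=0):
    """Serve `directory` over HTTP in a background thread.

    port=0 lets the OS pick a free port; read it back from
    server.server_address[1].
    """
    handler = partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(port, path):
    """Fetch one path from the local server, returning (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


if __name__ == "__main__":
    # Small self-check: serve a temporary directory and read a file back.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "hello.txt").write_text("hello from http.server\n")
        server = serve_directory(tmp)
        print(fetch(server.server_address[1], "/hello.txt"))
        server.shutdown()
        server.server_close()
```

Like the command-line version, this is meant for quick local sharing and testing, not for production use.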
In-depth explanation of using the dig command in Linux.
Shows examples, grabbing dns, authoritative nameservers, etc.
This bash tutorial presents a comprehensive list of useful string manipulation tips for bash scripting.
Bash-only, but really useful to avoid using sed or similar external programs.
This is a new web-based photo management application. Run it on your home server and it will let you find the right photo from your collection on any device. Smart filtering is made possible by obj...
You can find a rather extensive list here, with maybe Castl the closest to JS?
Contains moonscript and other usual suspects but also more esoteric ones like haskell, go cross compilers.
Exhaustive list, with pros and cons, of access to nextcloud files.
Includes a little 'push' function to send stuff directly to your nextcloud folders which could be adapted to different use cases.
Not mentioned: S3 primary storage on Nextcloud
Huge collection of computable, curated data from demographics to language, science & math, politics, social media. Many formats: numerical, time series, image, audio, geospatial. Can be exported as simple csv, or worked with in python notebooks on the page.
Research Guides: Social Science Data Sources & Statistical Methods: Free Data Sources
A collection of free sources of various kinds of data, on recommendation-basis from EMU.
Oooh, DeepL has added 13 European languages! Including Danish, Swedish, Hungarian, …
DeepL quality consistently outperforms Google Translate. It's also great because DeepL doesn't use English as common ground between non-English languages. https://www.deepl.com/blog/20210316.html
Attaching to the same tmux session multiple times - and using different windows and views into it.
Build log of Spotifypod: Spotify in a 4th-gen iPod (2004) made using an RPi Zero reusing the original clickwheel. Schematics and 3D files are included!
Really nice build and update of vintage iPod.
Self-hosted DNS routing and service, including both options for simple and painless routing and advanced options, e.g. REST api access.
Seems quite young still, however, and does not support a lot of domain name registrars yet.
Grafana is an amazing visualization tool used mainly by IT teams to monitor their infrastructure. As it's open-source there's huge contribution from the community on both datasource and panel making…
lsync can act as a repeatable rsync replacement. Under the hood it utilizes rsync (though there is an advanced rsyncssh integration available which does not re-transfer files over rsync that already exist on the target machine).
Ideal for local-remote scenarios, where changes occur on one machine and should be replicated on another (e.g. mirroring project directory and code changes, automatically pushing them to remote development environment for compilation/testing/building)
DigitalOcean guide here
git bisect usage:
git bisect start
> puts you into bisect mode
git bisect bad <commitref|HEAD>
> signal to git the commit where things definitely don't work right anymore (mostly HEAD)
git bisect good <commitref|HEAD>
> signal to git the commit where things definitely did work right still (can be any number of commits back).
You will now be put into a special bisect inspection mode and git will checkout various commits in a divide and conquer manner for you to declare their status.
git bisect good|bad
> After checking the state of the app/running tests/whatever, declare the commit as working or not-working. Git will then move you to the next commit until you find the right one introducing the bug.
git bisect reset
> get out of bisect mode
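The divide-and-conquer idea behind these commands can be sketched as a plain binary search. This is a simplified model, not git's actual implementation: it assumes a linear history ordered oldest to newest, with the first commit known good and the last known bad, and the names first_bad_commit and is_bad are hypothetical.

```python
def first_bad_commit(commits, is_bad):
    """Return (first bad commit, number of commits tested).

    commits: list ordered oldest -> newest; commits[0] is known good,
             commits[-1] is known bad (the two endpoints you declare
             with `git bisect good` and `git bisect bad`).
    is_bad:  callable that checks out/tests one commit and returns
             True if the bug is present.
    """
    good, bad = 0, len(commits) - 1
    tested = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        tested += 1
        if is_bad(commits[middle]):
            bad = middle      # bug already present: look in the older half
        else:
            good = middle     # still fine: look in the newer half
    return commits[bad], tested


if __name__ == "__main__":
    history = [f"c{i}" for i in range(100)]
    # Pretend the bug was introduced in commit c37.
    culprit, tested = first_bad_commit(history, lambda c: int(c[1:]) >= 37)
    print(culprit, "found after testing", tested, "commits")
```

Each answer halves the remaining range, which is why even a history of a hundred commits needs only a handful of good/bad declarations.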
I need to produce a version of a document where the changes compared to an earlier version are marked (as red text, specifically). Both are markdown. What is my best option?
The first thing that occurs to me is to turn them both into LaTeX and use latexdiff. Is there a better way?
Using nginx to limit the open endpoints
Includes gui and cli options, but as far as I can see not more involved setups like https://github.com/jonaswinkler/paperless-ng .
JSONata query and transformation language. Small introduction.
Comprehensive information on connecting to wireguard VPN servers through NetworkManager
This post describes the GnuPG pinentry process and provides a script which automatically chooses between a terminal or graphical interface based on the PINENTRY_USER_DATA environment variable.
A neat introduction to the way pinentry works. (Or seems to work, I have not done my due diligence here)
Edit and view the structure of PDFs, on the commandline or through JSON and thus gain all sorts of useful information to the makeup of a pdf.
Hides an (X11) window when another program is launched from it, and thus 'replaces' it with the other window.
Then restores the original window when the launched program closes.
Could be kinda neat for things like launching mpv in-place, or switching between fileview and filecontents (e.g. pdf reading)
A map visualization countering the Mercator projection's distortion, which understates Africa's immense size.
An illustrated guide to explain OAuth and OpenID Connect!
date-time parser for go. Similar to dateparse for python, just for Go - and I'm not sure if it supports as many natural formats, or languages.
"A new kind of emoji set. Mutant Standard breaks the traditional conventions of emoji to give you more inclusivity, choice and freedom."
Really nice and clear emoji set.
locate replacement, intended to be used in scripts etc
Could be a really nice all-round commandline solution for any time you need to build a database of stuff.
Got a lot of pdf files to read? Got music files? Videos? dotfiles? index anything and locate it later.
Setting up dropbear, especially to only take public ssh keys instead of password authentication.
Graceful & Minimal CSS design system in pure semantic HTML - picocss/pico
Browserpass web extension - browserpass/browserpass-extension
Read Linux news from the most popular Linux websites in one place.
Accumulation of linux sources from all over the web.
Unfortunately not customizable, but a good starting off point to find some addresses to point your RSS reader to.
Nice markdown notes management / writing app.
Simple interface, reminiscent of old Evernote clients - but targeted plainly toward plaintext note-taking.
(In)famous manual for asking technical questions - what to provide, what to leave out, and how to phrase it.
Can, with some implementation changes, conceivably be adapted to many more situations.
Python wrappers for dynamic menus (dmenu, rofi, fzf, ...)
Greatly simplifies calling and working with menus through python.
Personal Photo Management powered by Go and Google TensorFlow - photoprism/photoprism
Periodic overview of recently added #ActivityPub projects
- CastoPod Host: an open-source hosting platform made for podcasters who want to engage and interact with their audience
- GoToSocial: headless #Mastodon compatible Fediverse server project written in #Golang
- Tranquility: small ActivityPub server written in #Rust
- hacker-don: server and web frontend written in #Clojure, prioritizing user sanity, safety and privacy
- Kazarma: A #Matrix bridge to ActivityPub
For all projects visit: fediverse.party/en/miscellaneo… git.feneas.org/feneas/fedivers…
A long list of data sources, divided by general topics, and of varying quality.
Another, more gentle and longer, introduction to the idea of python generators (yield)
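For a quick feel of what yield does, here is a minimal sketch (the function names are made up for this example and are not taken from the linked article). A generator function produces values lazily, one per request, and remembers where it left off between requests.

```python
def countdown(n):
    """Yield n, n-1, ..., 1, pausing after each value."""
    while n > 0:
        yield n          # hand one value to the caller, then pause here
        n -= 1           # resumes from this line on the next request


def running_total(numbers):
    """Yield the cumulative sum of any iterable, one item at a time."""
    total = 0
    for number in numbers:
        total += number
        yield total


if __name__ == "__main__":
    gen = countdown(3)
    print(next(gen))                           # first value only
    print(list(gen))                           # whatever is left
    print(list(running_total(countdown(3))))   # generators chain together
```

Because values are produced on demand, generators can be chained like this without building intermediate lists.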
Different ways of listing group members on linux. Often hinges on distinction between primary and additional group in linux.
Easiest way (if shadow package is installed) groupmems -g <group> -l.
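The primary versus additional group distinction can be made explicit with Python's standard grp and pwd modules (Unix only). This is a small sketch, with group_members being a hypothetical helper name: the group database lists only additional members, so users who have the group as their primary group have to be collected from the user database.

```python
import grp
import pwd


def group_members(group_name):
    """Return all users in a group, sorted by name.

    Combines two sources:
      - additional members, stored on the group entry itself
      - primary members, i.e. users whose own entry points at this
        group's GID
    """
    group = grp.getgrnam(group_name)
    additional = set(group.gr_mem)
    primary = {
        user.pw_name
        for user in pwd.getpwall()
        if user.pw_gid == group.gr_gid
    }
    return sorted(primary | additional)


if __name__ == "__main__":
    import sys

    name = sys.argv[1] if len(sys.argv) > 1 else grp.getgrall()[0].gr_name
    print(name, "->", ", ".join(group_members(name)) or "(no members)")
```

Tools that consult only one of the two sources will miss users from the other, which is the usual reason different commands disagree about who is in a group.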
Combine python, scipy and pygame to turn wallpapers into low poly art images and animations.
Can I just say as a neuroscientist this is not your fault. Basically we think we have control over what we do but this is an illusion. For example you want to work on your project but you never do. So then you feel shame/guilt etc which only makes you more unproductive.
The solution to this is that the mind behaves more like a computer than we think. If you know how to properly interact with it, you can make it do whatever you want. Now, there is a long list of behavioural psychology focused on productivity, but I will start you off with one thing.
Right now, create a list. It can be on your computer, on a website like trello.com, or on paper; it doesn't matter. On it, write 6 things that you can accomplish very quickly in relation to your project.
For example, the list could be this:
- make a project directory for my project
- download the dataset needed
- install required tools for project
- write first variable
- write first function
- make the first graph
Set the commitment to do just one of these things per day; you don't have to do any more.
Try adding new goals to your list as you complete old ones.
The goals should be easy to achieve, taking 1 to 30 minutes each.
Pretty soon you will be doing more than just one task.
This method efficiently uses your brain's reward system. Doing small, clearly defined tasks with low commitment is easy and generally fun to do.
Doing a large complicated project with no clear approach is not fun to do.
There are tonnes of efficiency hacks and every person is different. Good luck.
Notcurses is a character graphics and TUI library; actually a promising ncurses alternative. It was mentioned (1, 2) already in this subreddit.
Besides being a library it comes with some handy apps which you may find interesting:
- ncls: an ls that displays multimedia in the terminal
- ncneofetch: a neofetch ripoff
- ncplayer: renders visual media (images/videos)
- nctetris: a tetris clone
- notcurses-demo: some demonstration code
- notcurses-input: decode and print keypresses
Style your webpage like Edward Tufte's handouts.
Uses a variety of css rules (embedding stuff in e.g. article, figure, checkbox tags) to emulate Tufte's visual design,
with margin notes, side-notes, and image positioning close to its text.
Some implementations seem a bit awkward (the way sidenotes are declared, the iframe wrapper introduced) and the link-underline text-shadow hack seems pretty bad (especially on restyling pages dark-mode), but it seems nice for inspiration.
Example page here.
On-device Google Photos alternative; only for mobile devices, not self-hosted.
Commandline client for python-bibtex and doi requests.
Could be useful either as-is or for inspiration for an own reference management software.
Hey folks!
I insert # TODO: comments everywhere, may it be code or dotfiles, doesn't matter. So if I'm bored, I open a dotfile and search for TODO to see them. Mh, not the best way to not forget them.
How would you do it to be reminded?
In IDEs, there's often a section where TODO/FIXME/etc lines are being listed, so you always see if there are some in the opened file (e.g. NetBeans does this).
Maybe have an indicator in the airline that just is a red tag or whatever that says TODO, so I know... Oh, I shouldn't forget!
How do you handle these comments?
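One low-tech option, independent of any editor, is a small script that walks a directory and lists every TODO/FIXME line, similar to the task list in an IDE. The sketch below is only an illustration and not taken from the thread; the name find_todos and the choice of tags are assumptions.

```python
import re
from pathlib import Path

# Matches "TODO" or "FIXME", an optional colon, then the rest of the line.
TAG = re.compile(r"\b(TODO|FIXME)\b:?\s*(.*)")


def find_todos(root):
    """Return (path, line number, tag, text) for every tagged line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = TAG.search(line)
            if match:
                hits.append((str(path), lineno, match.group(1), match.group(2).strip()))
    return hits


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, lineno, tag, text in find_todos(root):
        print(f"{path}:{lineno}: {tag}: {text}")
```

The path:line: prefix in the output is the format vim's quickfix list commonly understands, so the result could be loaded there, or simply printed when a shell starts.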
A quick overview on how to handle async processes in luv in Neovim.
Uses example of spawning a pandoc process, which is a good example starting point for reviewing implementations.
Intro: In the past I've had to deal with ISPs blocking ports and in some cases most usable incoming ports. I want to show you how to bypass this using Wireguard and a VPS. That way you can start selfhosting services even if your ISP doesn't want you to.
For this tutorial I'm going to be using a DigitalOcean VPS (their smallest one) but you can use any provider you want.
Ways of editing (internal) page numbering of PDFs
Pinata is an ipfs pin manager and helper, free to up to 1GB of data.
Can be used to e.g. pin a hosted website on the decentralized web for free, see here.
Program advanced opening rules, such as opening youtube links in mpv, or launching a program directly if already called from a terminal and in a new terminal window if not currently in one.
Seems really useful for advanced opening coding (e.g. fzf in floating mode from shortcuts, or just in term, and so on)
Generate nice previews for your nextcloud photos, and sort them automatically by date
Uses an .ics file (or a simulated file, through a HEREDOC, see solutions here) to send it through curl as a data stream and thus bring it into Nextcloud.
Written in python and intended for web scraping.
What's really nice is that it supports a lot of languages (english, french, spanish, russian, german, etc) without any need for switching in code.
Small (10kb gzip) css framework.
Describing itself as invasive, i.e. hard to drop into an existing project - but very easy to integrate into new ones.
Looks fun and light-weight.
Simple introduction to a bunch of argparse options, letting you create nice python cli applications pretty quickly (and more importantly rather painlessly, once you get used to it)
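As a taste of how little code this takes, here is a minimal sketch of a cli built with argparse. The program itself (a toy "greet" command with its name, --times and --shout options) is invented for this example and is not from the linked introduction.

```python
import argparse


def build_parser():
    """Describe the command line: one positional argument, two options."""
    parser = argparse.ArgumentParser(
        prog="greet", description="Print a greeting one or more times."
    )
    parser.add_argument("name", help="who to greet")
    parser.add_argument("-n", "--times", type=int, default=1,
                        help="how many times to print the greeting")
    parser.add_argument("-s", "--shout", action="store_true",
                        help="print the greeting in upper case")
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and return the printed lines."""
    args = build_parser().parse_args(argv)
    greeting = f"Hello, {args.name}!"
    if args.shout:
        greeting = greeting.upper()
    lines = [greeting] * args.times
    print("\n".join(lines))
    return lines


if __name__ == "__main__":
    main()
```

argparse generates the --help text and the error messages for missing or malformed arguments on its own, and passing a list to main makes the parser easy to try out without a shell.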
Ready-made stack with installation scripts, making use of docker to install a fully working and nicely integrated stack of containers on your Pi.
Adds additional niceties (like turning off swap, writing logs to memory) that improve Pi performance (and life-time of SD cards) - most of which can be adapted reading through the source.
Especially interesting for things like NodeRED which are not just interesting for IOT.
NEW Stack migrated here: https://github.com/SensorsIot/IOTstack
In this post, I'll introduce you to task-spooler ("ts" for short), a Linux program that lets you queue tasks to be executed either sequentially or in parallel, according to a user-defined number of slots.
Extensive explanation of generators in python (yield, etc.) with examples on top
Here's a quick guide on How to Set Up Multi-Factor Authentication for SSH on Ubuntu 20.04.
Self hosted alternative to Google Photos - LibrePhotos/librephotos
Nice cut alternative. Less portable, since cut is installed everywhere anyway, but it allows some really nice simple (or advanced) choices.
Especially nice to circumvent the cut troubles with repeated/irregular whitespace separation and similar issues.