mala::notes

Build Your Own Timeline Algorithm

[Figure: a 2D cloud of points titled "Timeline visualization". A legend on the top right matches each color with a timeline: blue = home, orange = local, red = public, light blue = tag/gopher. Several areas of the plot are circled in yellow with matching descriptive labels, e.g. "The fediverse plain", "The AI peninsula", "The Billionaires swamp", "The Gopher desert", "The island of empty posts", "Craft city", and so on.]

(This post was written as a follow-up to my FOSDEM talk but, of course, well, life happened ¯\_(ツ)_/¯ and TWO MONTHS have passed since then… In the meantime, BYOTA has become a Mozilla.ai blueprint, and you can now play both with its code and with a demo on Hugging Face Spaces. While the Mozilla.ai blog post linked above describes the blueprint in more detail, here you'll find a more personal take on my project.)

Timeline algorithms should be useful for people, not for companies. Their quality should be evaluated not in terms of how much time people spend on a platform, but in terms of how well they serve their users' purposes. Those purposes differ, from delving deeper into a topic to connecting with like-minded communities, solving a problem, or just passing time until the bus arrives. How they are pursued differs too, e.g. in how well an algorithm respects instances' bandwidth, one's own and others' privacy, trustworthiness, and software licenses.

This blog post introduces an approach to personal, local timeline algorithms that people can run out-of-the-box or customize. It relies on a stack that uses Mastodon.py to fetch recent timeline data (this could be adapted to any social network that still offers an API to its users), llamafile to calculate text embeddings locally, and marimo to provide a UI that runs in your own browser. Using this stack, you will learn how to search, group, and recommend posts from the Fediverse without any of your data leaving your computer.

Let that sink in

Despite being published more than a year ago, the Nature paper "Drivers of social influence in the Twitter migration to Mastodon" by La Cava et al. still feels current when it says: "After years of steady growth, popular social media are experiencing shifts in user engagement." The paper was referring to the abrupt #TwitterMigration that followed Elon Musk's acquisition of Twitter in October 2022, but similar events have happened since, most recently after Mark Zuckerberg's declarations last January.

I joined the Fediverse around May 2022, and there I found some great communities of practice that resonated a lot with me. Each community had a different size, character, language, and message frequency. Everything I read, I read because I chose to: I could see posts from people I followed, search by keywords or tags, or read the local and federated timelines when I wanted to look outside my personal bubble. The only algorithm available was the so-called "reverse chronological", with all its pros (no ads, only posts from people you follow) and cons (unless you read everything, you'll likely miss statuses published while you were asleep). That's where I started to think about the idea of a "personal algorithm": something I could customize and experiment with, and then share with other people so they could do the same if they wanted, or just use it as a black box as long as they trusted the source.

An apology for (some) algorithms

This is what the Build Your Own Timeline Algorithm (BYOTA) idea is about: it is not a new algorithm, an end-user application, or a service, but rather a call to action, and a playground where people can easily tinker with, create, and share their own algorithms. To build this playground, I decided to delve deeper not just into platform algorithms (which I had already experienced while working at Twitter), but also into existing projects developed on the Fediverse. If you want a clearer idea of how platform algorithms, and recommenders in particular, work, this blog post (part of an excellent series!) by Luke Thorburn is a great starting point. As for algorithms running on the Fediverse, the following provide some good food for thought:

Every algorithm I saw, no matter whether it ran on a closed platform or on the Fediverse, was affected by one or more of the following problems: bias; lack of privacy, transparency, and user control; dependency on ML models that are complex, computationally heavy, and need to run in a centralized way. Reverse chronological itself (which many welcome as the absence of an algorithm) is indeed an algorithm, and it is biased too, as it generally favors statuses by people who write more and in the same timezone as me. Another, more important thing I realised is that, in practice, all of the above are actual problems only when the objective is to retain users on one's platform. We can break this assumption, though, by deciding that we do not need to compete with commercial algorithms: that is not a mandate for distributed social networks, and serving people's purposes is more than enough.

[Slide: "Breaking the assumptions". On the left, a bulleted list: "Choose one or more of: biased; not transparent; complete lack of control; not privacy preserving; huge computational cost; require deep ML knowledge; cannot be run locally." On the right, a matching list with arrows pointing from left to right: "Open, substitutable models; clear, simple methods; make code available/easy to run; don't share personal data; use lighter models; out-of-the-box embeddings; build so it runs locally :-)". At the bottom: "Also true for search and recommendations!", with an arrow pointing to "All the above work here too!"]

When the objective changes, one immediately realizes that each of the problems described above has a solution: a solution that relies on open, small, interpretable models which can run locally for ad-hoc tasks without requiring much ML knowledge, and which only needs post contents, relying on sentence embeddings to provide semantically relevant search results and recommendations.

BYOTA’s building blocks

BYOTA is built on a simple premise: do something useful while preserving user privacy and control, using open models and local-first technologies. While the concept sounds straightforward, developing something that is easy to use and customize, and at the same time computationally light and useful, is not as simple as it sounds. Luckily, there are some great tools we can rely on for this: Mastodon.py, llamafile, and marimo.

Mastodon.py (MIT License) is a Python client library for Mastodon. You'll use it to download recent posts from your instance's timelines. Since all you need is status content, you can swap this library for anything else that provides similar short texts: a generic ActivityPub client, a client for a different social network's API, a browser extension, an RSS feed reader, and so on. You can also mix these if you want to play with cross-platform search and recommendations.
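For instance, here is a minimal sketch of how you could pull a few timelines with Mastodon.py (the instance URL and access token are placeholders; BYOTA's actual setup is described in its README):

```python
from mastodon import Mastodon

# Connect to your instance with an access token created in your
# account's Development settings (URL and token are placeholders).
client = Mastodon(
    api_base_url="https://fosstodon.org",
    access_token="YOUR_ACCESS_TOKEN",
)

# Fetch recent statuses from the four timelines discussed below.
timelines = {
    "home": client.timeline_home(limit=40),
    "local": client.timeline_local(limit=40),
    "public": client.timeline_public(limit=40),
    "tag/gopher": client.timeline_hashtag("gopher", limit=40),
}

# Each status is a dict; its "content" field holds the post as HTML.
for label, statuses in timelines.items():
    print(label, len(statuses))
```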

Llamafile (Apache 2.0 License) is a Mozilla tool that packages a language model into a single executable file that runs on most platforms. It is 100% local and has been optimized to run on slower hardware, from Raspberry Pis to my 9-year-old laptop. It is based on llama.cpp, which supports a plethora of models, not just LLMs. In particular, I chose all-MiniLM because it is a sentence-transformer model, specialized in calculating embeddings from text. It is tiny (50MB) and comes with open code, research papers, and datasets. Note that llamafile is a swappable component as well: if you already have a favorite embedding server, you can use it instead of llamafile, as long as it is compatible with llamafile's API (and even if it is not, writing a new adapter is quite simple).
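As a sketch, this is how you could ask a local llamafile server for an embedding, assuming it exposes the llama.cpp-style /embedding endpoint on the default port (flags, endpoint, and response format may differ across versions, so check your server's documentation):

```python
import requests

# Query a llamafile server started with an embedding model, e.g.:
#   ./llamafile --embedding -m all-minilm.gguf --port 8080
def embed(text: str) -> list[float]:
    response = requests.post(
        "http://localhost:8080/embedding",  # llama.cpp-style endpoint
        json={"content": text},
    )
    response.raise_for_status()
    # Assumes a JSON object with a single "embedding" field;
    # newer server versions may wrap it in a list instead.
    return response.json()["embedding"]

print(len(embed("Hello, Fediverse!")))  # e.g. 384 dimensions for all-MiniLM
```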

Now, let's talk embeddings: if you don't know what they are, you can think of them as numerical descriptors of your Mastodon statuses, which sit closer together (as two nearby cities' coordinates do on a map) the more semantically similar their respective statuses are (see the "Mastodon map" figure below). If you want to learn more about embeddings, check out Vicki Boykis' excellent paper.
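To make the map intuition concrete, here is a toy example with made-up 3-dimensional vectors (real embeddings have hundreds of dimensions), using cosine similarity, one common way to measure how close two embeddings are:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """1.0 = same direction (very similar), near 0 = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dimensional "embeddings" (real ones have hundreds of dimensions).
cats = np.array([0.9, 0.1, 0.0])
kittens = np.array([0.8, 0.2, 0.1])
cycling = np.array([0.0, 0.2, 0.9])

print(cosine_similarity(cats, kittens))  # ~0.98: close on the "map"
print(cosine_similarity(cats, cycling))  # ~0.02: far apart
```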

Marimo (Apache 2.0 License) is a reactive notebook for Python that is also shareable as an application. It mixes code, markdown, and UI widgets in the same interface, so you can (1) develop as you would in other notebook environments, (2) share your work as an application by hiding the code and displaying only the UI components, and (3) let people build their own UI on a grid-based canvas, fully customizing their experience. Above all, marimo relies on WASM to run everything inside one's browser. In practice this means you can download a marimo notebook from its repo, install its Python dependencies, and run it locally like any Python notebook; but you can also deploy it as HTML+JavaScript files, host it somewhere super cheap (because the server will not run any of your code), and people will be able to run it in their browser with no need to install anything else.
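For a feel of the format, here is a minimal (hypothetical) two-cell marimo notebook; editing the text box automatically re-runs the cell that reads its value. You can open it with `marimo edit` or serve it as an app with `marimo run`:

```python
import marimo

app = marimo.App()

@app.cell
def _():
    import marimo as mo
    return (mo,)

@app.cell
def _(mo):
    # A UI widget: the last expression in a cell is what gets displayed.
    query = mo.ui.text(label="What are you interested in?")
    query
    return (query,)

@app.cell
def _(mo, query):
    # Re-runs automatically whenever the text box above changes.
    mo.md(f"Searching for: **{query.value}**")
    return

if __name__ == "__main__":
    app.run()
```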

[Screenshot: a Python notebook that includes both UI elements (forms, plots, and dataframes) and code cells.]

What you can do with BYOTA

When you run the BYOTA notebook, you first log into your Mastodon account and download statuses from a few chosen timelines (you can find more information about the initial setup in the README.md file). After that, the first component you can play with is the one that visualizes post embeddings.

In the figure at the top of this blog post you can see a 2D plot of embeddings collected from four different timelines: home (blue, only people I follow), local (orange, all posts from my instance, which is fosstodon.org), public (red, federated posts from people followed by users of my instance), and the timeline I got by searching for the #gopher hashtag (light blue). Using the visualization tool, I manually selected, analyzed, and annotated a few areas of this map. Semantically similar statuses will always be close to each other regardless of which timeline they appeared in, so from this view you can follow similar content across different timelines. This is simple to grasp and easy to interpret, as it is based on the actual content of the posts.
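Here is a sketch of how such a map could be produced, using PCA from scikit-learn as a stand-in for the dimensionality reduction (the BYOTA notebook's actual projection and interactive plot may differ) and random vectors in place of real embeddings:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Stand-in data: random vectors in place of real status embeddings.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(200, 384))  # one row per status
labels = rng.choice(["home", "local", "public", "tag/gopher"], size=200)

# Project the high-dimensional embeddings down to 2D for plotting.
points = PCA(n_components=2).fit_transform(embeddings)

for label in ("home", "local", "public", "tag/gopher"):
    mask = labels == label
    plt.scatter(points[mask, 0], points[mask, 1], s=10, label=label)
plt.title("Timeline visualization")
plt.legend()
plt.show()
```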

Once semantic similarity is encoded in embeddings, semantic search becomes a simple distance calculation: a search field in the notebook lets you provide a status ID (which you can get by selecting a post in the map) and look for the statuses most similar to it (top 5 by default). You can also just make up a sentence that describes what you are interested in and get results from the different timelines.
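In code, this boils down to ranking statuses by cosine similarity to the query embedding; a minimal sketch (function and variable names are illustrative):

```python
import numpy as np

def top_k(query_vec, embeddings, texts, k=5):
    """Return the k statuses whose embeddings are closest to the query."""
    emb = np.asarray(embeddings)  # shape: (n_statuses, dim)
    q = np.asarray(query_vec)     # shape: (dim,)
    sims = emb @ q / (np.linalg.norm(emb, axis=1) * np.linalg.norm(q))
    best = np.argsort(-sims)[:k]  # indices, most similar first
    return [(texts[i], float(sims[i])) for i in best]

# Usage: embed the query text the same way as the statuses, then rank:
# results = top_k(embed("free software and opensource"), embeddings, texts)
```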

[Slide: "Semantic search". A form reads "Enter a status id or some free-form text to find the most similar statuses"; the text box below contains "I am a fan of free software and everything opensource". Below that, six posts are shown in a three-column table: status IDs on the left, a "label" column in the center with values such as "public", "local", "tag/gopher", and a "text" column on the right showing status contents such as "Software Libre", "Visit @openSUSE this coming weekend for all things #opensource at #FOSDEM", "Best cross-platform apps for #linux", "The value of Open Source Software", etc.]

Another available feature is post re-ranking. To perform it, each post in a given timeline is assigned a score that depends on its similarity to a whole other set of posts. For instance, you can re-rank the latest 800 statuses from your local timeline by their similarity to your own most recent posts.

The interesting part here is that the choice of which timeline to sort is yours: you can surface the posts closest to your interests from just the people you follow (home timeline), from users of your own instance (local timeline), or from other instances (federated timeline). You can also re-rank the results of a keyword or hashtag search.

Another thing you control is what provides the "style" for your re-ranking. Your last 50 posts is just an example: you can pick different sets depending on what you are looking for at any given time. In effect, you get a content-based re-ranking of any timeline: you apply the "style" of one set of statuses to another list, putting on top the statuses that are overall closest to the ones you provided. As an example, the figures below show the same list of messages I wrote on Mastodon, re-ranked according to the "style" provided by the #gopher and #bookstodon hashtag search results.
My own posts, re-ranked according to their similarity to posts in #gopher.

My own posts, re-ranked according to their similarity to posts in #bookstodon.
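A minimal sketch of this kind of re-ranking, assuming you already have the two sets of embeddings (variable names are illustrative): each timeline status is scored by its mean cosine similarity to the "style" set, and statuses are then sorted by that score:

```python
import numpy as np

def rerank(timeline_emb, style_emb):
    """Order timeline statuses by average similarity to a 'style' set.

    timeline_emb: embeddings of the statuses to re-rank
                  (e.g. the latest 800 posts from the local timeline)
    style_emb:    embeddings of the statuses providing the 'style'
                  (e.g. your own last 50 posts, or a hashtag search)
    """
    t = np.asarray(timeline_emb, dtype=float)
    s = np.asarray(style_emb, dtype=float)
    # Normalize rows so that dot products become cosine similarities.
    t /= np.linalg.norm(t, axis=1, keepdims=True)
    s /= np.linalg.norm(s, axis=1, keepdims=True)
    scores = (t @ s.T).mean(axis=1)  # one score per timeline status
    return np.argsort(-scores)       # indices, best match first
```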

Does this really work?

One of the main issues when building open tools is ending up with something people will never actually use. For algorithmic timelines in particular, the main concern I saw preventing the adoption of new tools was privacy: people did not want their personal information to leave their own computers, nor anything they had already made public to be fed into some centralized algorithm. The mandate then becomes to run everything locally, which raises two very natural questions: "Is this really local?" and "Does this really run on my computer?"

“Is this really local?”
The answer is: check for yourself. I ran the embedding server on a plane (and no, I am not paying for that poor wifi!). As for the notebook, you can verify with your browser's developer tools that the only remote connections are those to the marimo and WASM dependencies at bootstrap time, and those required to download posts from your Mastodon instance via the timelines API. You are always in control of which, and how many, messages you download before running the algorithm. From then on (embeddings, plots, search, re-ranking) everything runs on your device. The most interesting part is that it does not have to be local; rather, you are in control: for instance, if you cannot run a local embedding server, you can rely on a remote one, choosing which software and model it runs and where. You can run the Python code on a server in your local network or in the cloud, if you trust your provider, or simply host the HTML/WASM version of your notebook and let clients download timeline data and run the algorithm themselves.

“Does this really run on my computer?”
About two years ago I tried embedding all the messages I could download from my Mastodon timeline using the tweet-topic-21-multi model, and I hit a wall when I realised it took three full minutes on my 2016 MacBook Pro. So, when I had the chance to run optimized llama.cpp embedding servers, I repeated the same kind of experiment: I embedded 800 statuses using four different models, from the 22M-parameter all-MiniLM to the 7B-parameter e5-mistral, testing two different local servers (llamafile and ollama) on two different laptops (my 2016 MacBook Pro with an Intel CPU and my 2024 one with an M3 Max). The summary is: it can work on older hardware, and on recent hardware you will barely notice the calculation overhead.

all-MiniLM took 11 seconds on the M3 and 52 seconds on my 9-year-old laptop, and its embeddings are already good! Larger models might provide extra perks (which I have not investigated yet), but at the price of more compute. Interestingly, despite the fact that both ollama and llamafile are based on llama.cpp, ollama seems faster with smaller models on newer hardware, while llamafile becomes the better choice for larger models on older hardware. Finally, I used default parameters for both servers, so there is definitely room for improvement.

Conclusions

Given how customizable the Fediverse experience is, I do not really think everyone needs a timeline algorithm. What I believe, though, is that if someone ever needs one, then BYOTA can be a more than decent starting point. This is definitely still a work in progress with many possible directions to grow; among them, the ones I find most interesting are:

… But above all, I'd like to hear from you! If you are interested in this topic, you can also check out my talk at FOSDEM or its companion Mastodon thread, and play with the latest code. If you have any comments or questions, feel free to reach out at my_first_name at mozilla.ai or, obviously, on the Fediverse ;-)