Implementing Property Searching for Micropub
One of the proposed extensions to Micropub that I find fascinating is an extension for querying a post list. It would let you query your Micropub installation for a list of posts matching arbitrary criteria. Koype currently supports looking up the MF2 of channels, categories, and entries. However, what I really wanted were the fields mentioned by Grant: the ability to filter on the properties of entries. That would let me check whether I've already interacted with something in my proposed social reader. Implementing this was not easy, though. I had to make a virtual table in SQLite, scan the source MF2 of every entry on disk that a query touches (which was all of them, most of the time, in local tests!), and add two custom functions to SQLite to look for the wanted values in the right places.
Implementing the Virtual Table for Querying
I've read that SQLite's documentation is good. I did not find that to be the case here, mainly because this gets into the plumbing of SQLite, and I'm still learning my way around it. I also tend to peek into the source code, but reading C is not something I have done in a long time, and I most definitely need a refresher. Instead, I leaned on the example code provided by the Rust library I'm using to interact with SQLite. Reading that alongside the documentation for virtual tables got me to a working table implementation. The goal was to provide a mapping between the properties stored on disk for an entry and its corresponding ID. This enabled queries like the following:
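-- The SELECT list is assumed for illustration; the rest follows the original fragment.
SELECT entries.*
FROM entries
JOIN epv ON epv.uid = entries.uid
WHERE mf2_json_has_value_in_property(epv.properties, 'like-of', 'https://lobste.rs') IS TRUE
ORDER BY entries.published_at DESC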
This lets me look up any post that has a link to Lobsters as a like. What's not described by the name is that this function checks if the value of the property is either equal to or begins with the provided string. This kind of query could be translated to something like the following:
I'd have to be careful with this kind of query. If I'm looking up a URL, I'd probably want it to match against the authority value, but that would require parsing every string to see if it's a valid URL and then doing the match. Still, I now have a way to look up every link from Twitter that I've liked, replied to, or otherwise engaged with. I have a ticket to see if I can allow some sort of hint for full-text searching of properties. That would let me search the contents of things I've posted, so I can check for deep links.
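To make the trade-off concrete, here is a minimal sketch of the equal-or-prefix check described above, next to a stricter authority-only check. It is written in Python with the standard `sqlite3` module rather than the Rust I actually use, and it makes several assumptions: `epv` is an ordinary table standing in for the virtual table, `mf2_json_has_authority_in_property` is a hypothetical function name, and only plain string values are inspected.

```python
import json
import sqlite3
from urllib.parse import urlsplit


def property_values(properties_json, prop):
    """Return the plain string values stored under `prop` in an MF2 properties object."""
    try:
        properties = json.loads(properties_json)
    except (TypeError, ValueError):
        return []  # NULL or malformed JSON: nothing to match
    values = properties.get(prop, []) if isinstance(properties, dict) else []
    if not isinstance(values, list):
        values = [values]
    # Nested objects (an embedded h-cite, say) are skipped in this sketch.
    return [v for v in values if isinstance(v, str)]


def has_value_in_property(properties_json, prop, needle):
    """Equal-or-prefix match. Returns 1 or 0, which SQLite treats as true/false."""
    if not isinstance(needle, str):
        return 0
    return int(any(v.startswith(needle) for v in property_values(properties_json, prop)))


def authority_of(value):
    """Return the lowercased authority of a URL, or '' if it cannot be parsed."""
    try:
        return urlsplit(value).netloc.lower()
    except ValueError:
        return ""


def has_authority_in_property(properties_json, prop, authority):
    """Stricter, hypothetical variant: compare only the URL authority."""
    if not isinstance(authority, str) or not authority:
        return 0
    wanted = authority.lower()
    return int(any(authority_of(v) == wanted for v in property_values(properties_json, prop)))


conn = sqlite3.connect(":memory:")
conn.create_function("mf2_json_has_value_in_property", 3, has_value_in_property)
conn.create_function("mf2_json_has_authority_in_property", 3, has_authority_in_property)

# Plain tables standing in for the real entries table and the virtual table.
conn.executescript("""
    CREATE TABLE entries (uid TEXT PRIMARY KEY, published_at TEXT);
    CREATE TABLE epv (uid TEXT, properties TEXT);
""")
sample = [
    ("a", "2021-01-01T00:00:00Z", {"like-of": ["https://lobste.rs/s/abc123"]}),
    ("b", "2021-01-02T00:00:00Z", {"like-of": ["https://lobste.rs.example.net/post"]}),
    ("c", "2021-01-03T00:00:00Z", {"in-reply-to": ["https://lobste.rs/s/def456"]}),
]
for uid, published_at, properties in sample:
    conn.execute("INSERT INTO entries VALUES (?, ?)", (uid, published_at))
    conn.execute("INSERT INTO epv VALUES (?, ?)", (uid, json.dumps(properties)))


def find_likes(function_name, needle):
    """Run the query from the post with the given matching function."""
    sql = f"""
        SELECT entries.uid FROM entries
        JOIN epv ON epv.uid = entries.uid
        WHERE {function_name}(epv.properties, 'like-of', ?) IS TRUE
        ORDER BY entries.published_at DESC
    """
    return [row[0] for row in conn.execute(sql, (needle,))]


prefix_matches = find_likes("mf2_json_has_value_in_property", "https://lobste.rs")
authority_matches = find_likes("mf2_json_has_authority_in_property", "lobste.rs")
```

With this sample data the prefix check returns entries `b` and `a`, because `https://lobste.rs.example.net/post` also begins with the search string, while the authority check returns only `a`. That false positive is the reason I'd want authority matching for URLs, at the cost of parsing every value.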
The concept of “virtual tables” isn't unique to SQLite (see Postgres's wiki or MariaDB's knowledge base for more info). However, what I'm doing should definitely make your nose turn up a bit: I'm reading any number of JSON files from disk (about 2,100 for my site at the time of writing, with about 5 to 15 new posts added each day) each time a request is made to my homepage, which makes it possible to DoS my site just by hitting Refresh in your browser a few times! I've added logic to short-circuit the request on the site and applied an eager connection timeout in SQLite (I had set it to five minutes before; don't ask), so it'll just return empty lists (at best) if the query doesn't resolve in time. I'd love ideas on caching or optimizing how I've done the lookup, as well as on storing the information for entry properties. Some of the conventional knowledge I have around these approaches is a bit moot since
See it live!
My website uses Koype as its CMS, and this has been available since v0.1.4. It powers the feed of items at the bottom of my homepage that shows things I've been doing on GitHub from my site, as well as the things I've interacted with on Twitter. I'm hoping to tinker with this more over time to see what other kinds of queries I can build. I'm also hoping more people implement such a query, because it provides a cheap affordance (confirmation of pre-existing data) that can be helpful in social readers.
The astute computer scientist in you probably noticed that this kind of solution doesn't run in constant time: every query scans every entry it touches. A poor implementation could actually lock up my site (databases tend to do that)! I have logic for timing out database calls, though. I'm also curious about adding some benchmarks to see whether my naive approach is faster than something like JMESPath. Frankly, I'd love to have something like JMESPath ship as an extension to SQLite's existing JSON methods.
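For anyone wanting to try the same safeguard, here is one way the "give up and return an empty list" behaviour could be sketched in Python's standard `sqlite3` module. This is not how my Rust code does it: it swaps in SQLite's progress handler as the abort mechanism, and `run_with_deadline` and the slow query are hypothetical names made up for the example.

```python
import sqlite3
import time


def run_with_deadline(conn, sql, params=(), timeout_s=0.5):
    """Run a query, returning [] if it does not finish within `timeout_s` seconds."""
    deadline = time.monotonic() + timeout_s
    timed_out = False

    def out_of_time():
        nonlocal timed_out
        timed_out = time.monotonic() > deadline
        # A non-zero return value makes SQLite abort the running statement.
        return 1 if timed_out else 0

    # Ask SQLite to call the handler every 1,000 virtual machine instructions.
    conn.set_progress_handler(out_of_time, 1000)
    try:
        return conn.execute(sql, params).fetchall()
    except sqlite3.OperationalError:
        if timed_out:
            return []  # degrade to an empty list instead of hanging the page
        raise  # syntax errors and the like should still surface
    finally:
        conn.set_progress_handler(None, 0)  # remove the handler again


conn = sqlite3.connect(":memory:")

# A deliberately slow query standing in for a scan over thousands of entries.
SLOW_QUERY = """
    WITH RECURSIVE n(i) AS (SELECT 1 UNION ALL SELECT i + 1 FROM n WHERE i < 1000000000)
    SELECT count(*) FROM n
"""
slow_result = run_with_deadline(conn, SLOW_QUERY, timeout_s=0.05)
fast_result = run_with_deadline(conn, "SELECT 1")
```

The deadline is only checked between batches of SQLite instructions, so time spent inside a single custom function call or a disk read is not interrupted. It bounds the damage of a slow scan rather than making the scan cheaper.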
Another change: the indieweb Rust library now supports capturing these properties (as well as the exists and not-exists fields), so any client or server using that library can pluck them out!