I'm a full-stack web developer, and this is my blog. Please connect with me on LinkedIn or visit my Github for more! Also, you may be interested in learning more about me.

Projects

  • Building a Cookbook in Python, for Reasons (Part 2)

    In my last post, I talked about building a cookbook/recipe blog that stores recipes emailed to a special address. I talked about setting up the backend, the service that provides an ‘email received’ webhook, and the library that parses recipe information from a website using the Schema.org standardized schema.

    Where we left off, we had just grabbed all the information about a recipe – name, ingredients, cook time, etc., and dumped them into a Python dict. Now, we can inject them into a Markdown template for use by Jekyll.

    Loyal readers of this blog know that I’m a huge Jekyll fan. It’s so easy to create new static HTML files from a basic template.

    In my case, the template looks something like this:

    ---
    layout: post
    title:  {{title}}
    source_site: {{source_site}}
    source: {{canonical_url}}
    ....and so on
    ---
    
    ### Ingredients
    {{ingredients}}
    
    ### Instructions
    {{instructions}}
    

    Everything between the --- lines is considered “front matter” and can be used as data to be injected into a post, or metadata about a post, etc.

    Our templatizer just needs to read in this file and call a bunch of replace()s. It looks something like this:

    
        RECIPE_TITLE = "{{title}}"
        RECIPE_SOURCE = "{{source_site}}"
        RECIPE_URL = "{{canonical_url}}"

        with open(TEMPLATE_FILE, 'r') as template:
            buffer = template.read()
            buffer = (
                buffer.replace(RECIPE_TITLE, recipe.get("title"))
                      .replace(RECIPE_SOURCE, recipe.get("site_name"))
                      .replace(RECIPE_URL, recipe.get("canonical_url"))
                # ...and so on
            )
    

    The templatizer also generates a file name based on the slug of the recipe and the date it was shared (not the date it was initially posted on the source site). Anyway, this is all relatively simple.
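
    To make that concrete, here is a sketch of what the filename generation could look like; the helper name and exact slug rules are illustrative, not the project's actual code:

    import re
    from datetime import date

    def make_filename(title: str) -> str:
        # Hypothetical sketch: lowercase the title, collapse anything
        # non-alphanumeric into hyphens, and prefix the date the recipe
        # was shared (i.e. today), which is also the format Jekyll expects.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
        return f"{date.today().isoformat()}-{slug}.md"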

    But now we have to… dun dun dunnnnnn talk to Github.

    Talking to Github

    I wanted a human in the loop here. This is not a high-traffic application and my endpoint is semi-insecure, so if someone were to start spamming it with junk data…there’s not much they could accomplish, but there’s less they can accomplish if I have to manually approve every recipe first. So while it’s fairly easy to force-push directly to main with the Github API, I wanted my app to create a PR instead.

    It could be worse: Github’s documentation is pretty good, and they make it real easy to create a scoped personal access token just for the actions you need to take. I won’t walk you through every line of code that needs to happen here, but the general steps are:

    def do_github_stuff(content, filename):  # i'm good at naming things
        main_sha = get_branch_sha("main")  # gets the hash of the tip of main
        # creates a new branch off of main, basically `git checkout -b newbranch`;
        # the branch name is generated inside this function and returned
        new_branch, new_ref, branch_name = create_new_branch(main_sha)
        # adds my new file to the new branch
        new_sha = create_tree(new_branch.get('object').get('sha'), content, filename)
        # creates the commit; unlike the command line, we explicitly have to tell
        # git who the parents of the commit are. this returns a new hash
        new_commit = create_commit(new_sha, main_sha)
        # forces the tip of the new branch (refs/heads/newbranch) to point at the new hash
        update_ref_pointer(new_ref, new_commit)
        # the second argument is actually the PR title, generated as f"adds {filename}"
        create_pull_request(branch_name, filename)
    

    That feels like a lot, and in some ways it is, but in other ways it’s just five POSTs.
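
    To give a flavor of what those POSTs actually look like, here is a sketch of two of the helpers using plain requests calls against the Github REST API. The endpoints are real; the OWNER/REPO placeholder, the token handling, and the helper bodies are illustrative rather than the app's actual code:

    import os
    import requests

    API = "https://api.github.com/repos/OWNER/REPO"  # placeholder repo
    HEADERS = {
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    }

    def create_new_branch(main_sha):
        # POST /git/refs creates a ref pointing at an existing commit,
        # i.e. the API equivalent of `git checkout -b newbranch`.
        branch_name = "newbranch"
        resp = requests.post(
            f"{API}/git/refs",
            headers=HEADERS,
            json={"ref": f"refs/heads/{branch_name}", "sha": main_sha},
        )
        resp.raise_for_status()
        return resp.json(), f"refs/heads/{branch_name}", branch_name

    def create_pull_request(branch_name, pr_title):
        # POST /pulls opens the pull request so a human can approve it.
        resp = requests.post(
            f"{API}/pulls",
            headers=HEADERS,
            json={"title": pr_title, "head": branch_name, "base": "main"},
        )
        resp.raise_for_status()
        return resp.json()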

    All I can say is thank goodness I watched that presentation about git commit hashes earlier this year or this would have been significantly more difficult.

    The frontend

    I did the basic Github Pages/Jekyll setup. In doing so, either I missed a step, or the setup is missing some steps. When I clicked the setup button, I got:

    • a repo with a deploy Github action that installed the wrong version of Ruby
    • nothing else?

    So then I did jekyll new on my local machine to spin up a new site, but the default settings in _config.yml weren’t appropriate for a site hosted on Github Pages. It turns out that Github does have docs on how to configure Jekyll, but I wish the button did more of this for me.

    Anyway! I did manage to get up and running, finally. However, only a handful of the Github Pages-supported Jekyll themes are set up for blogs. (Minimal, minima, and hacker, for those keeping score.) If you want to use a different theme, you’ll have to override some theme defaults. Which is fine; there’s excellent documentation on Jekyll’s site about doing so.

    So now this thing is hooked up! We just have to subscribe our email robot to our listserv and wait for the recipes to come rolling in.

    A Spongebob-style title card reading "Three days later..."

    Nobody has posted a recipe! This is a disaster! No, actually, it’s just a pretty low-traffic list, and it shouldn’t be surprising that three days in we have nothing. But I’d like to seed the site with some examples so it’s not just empty.

    This listserv has been hosted on Google Groups for the last few years (we are currently in the process of de-Googling due to unrelated issues), and say what you will about Google, they at least do let you export your data. As a moderator of the listserv, I have access to the entire group’s message history. So I made a Google Takeout request.

    A Spongebob-style title card reading "Three days later..."

    A few days later, a zip file containing my data was sent to me. All the messages are there…as one giant .mbox file.

    Luckily, this is a solved problem. Using this gist as a reference, I was able to parse every email and look for ones that contained URLs. I pulled about 10 random ones out and fed them to my API, which was able to successfully parse half of them. (That, by the way, is how I ended up becoming a contributor to the recipe_scrapers library.)
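
    (For anyone who hasn’t dealt with .mbox before: Python’s standard library mailbox module will happily iterate over one. The URL-hunting step looks roughly like this; a sketch, not the exact script I ended up with:)

    import mailbox
    import re

    URL_RE = re.compile(r"https?://\S+")

    def urls_from_mbox(path):
        # Walk every message in the Takeout export and yield any URLs
        # found in the plain-text parts.
        for message in mailbox.mbox(path):
            for part in message.walk():
                if part.get_content_type() == "text/plain":
                    payload = part.get_payload(decode=True)
                    if payload:
                        yield from URL_RE.findall(payload.decode(errors="replace"))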

    In all seriousness, this was a very rewarding project. I love when something comes together in a weekend or two (it’s taken me longer to write up this series of posts than to actually create the project), and I love when tech can be used to make something that doesn’t scale to a billion people but solves a specific problem for ten people. Or maybe even just me; I’m not sure the rest of my potluck group cares. :) But I got to learn about FastAPI and email webhooks, and to get more familiar with Jekyll. I count that as a big, delicious win.

  • Building a Cookbook With Python, for Reasons (Part 1)

    📖 + 🐍 = ?

    Note: This is a longer writeup of the project I presented at PyLadies 2025. If you want the bite-size version, watch it here.

    I’m part of a monthly potluck that organizes meetups over email, then meets in person to eat delicious vegan food. (I’m not vegan, but I love any excuse to try new recipes and eat more plants.) Occasionally, potluck members will email around a link to the recipe they used or are planning to use.

    A pretty normal, boring email, where the sender says, "I'm thinking of bringing curried lentils with sweet potatoes and hazelnuts."

    Pretty common thing, but I wanted to capture these recipes in a more permanent way than a link to a random blog in a mailing list archive. Link rot is a thing, plus it’s just not very fun to have to search old messages and try to remember when that delicious soup recipe was sent around – was it this year or last?

    I had an idea pop into my head over the summer (I love when these things happen): I could automatically post recipes to a new blog whenever they were shared on our listserv. So then I spent the next two weekends making it happen.

    I’ll discuss how I built it in a series of posts (the writeup is far too long for a single post). Today’s post is about the overall stack, as well as the FastAPI backend.

    The stack

    We need an email address that will serve as the “listener” to notice when new recipes are posted and do something with them. This could be anything, but I may as well buy a domain; then I can host the blog there as well. So I went to Cloudflare and bought brooklandrecipe.party.

    We also need a backend that can do the “something” when a new email comes in. Based solely on the fact that the first recipe-parsing library I found was written in Python, I chose to use FastAPI, a lightweight Python web framework. This turned out to be an excellent choice.

    We need a way for the email listener to talk to the backend. I found ProxiedMail, which has incoming email webhooks.1 And it’s got a free plan. Fantastic. Now when anyone sends an email to [email protected]2, we can make a POST request to our new API.

    And we need a frontend, preferably one that updates with (minimal) intervention from a human. Jekyll with Github Pages is great for this. Posts are written in Markdown, and upon a successful merge to main the site will automatically build and deploy.

    Basically, we need this:

    A flow chart showing the following steps, in order: Incoming email -> POST recipes -> email contains a URL? -> URL contains a recipe? -> Create new post from template

    Those are all the parts! Let’s see how they fit together.

    The email and the backend

    As previously stated, I set up [email protected]3 to post back on receipt of an email. Before doing anything else, we can inspect the payload by temporarily pointing the postback destination at a free4 URL on webhook.site. This shows us the shape of the payload, which I have shortened by removing the boring stuff:

    {
      "id": "A48CF945-BD00-0000-00003CC8",
      "payload": {
        "Content-Type": "multipart/alternative;boundary=\"000000000000a4a98d063b045239\"",
        "Date": "Mon, 28 Jul 2025 17:53:20 -0400",
        "Mime-Version": "1.0",
        "Subject": "test",
        "To": "[email protected]",
        "body-html": "<div dir=\"ltr\">hello</div>\r\n",
        "body-plain": "hello\r\n",
        "from": "Rachel Kaufman <my-email>",
        "recipient": "[email protected]",
        "stripped-html": "<div dir=\"ltr\">hello</div>\n",
        "stripped-text": "hello",
        "subject": "test"
      },
      "attachments": []
    }
    

    This is going to post to our backend REST API built with FastAPI. FastAPI uses Pydantic to define types under the hood, so we can design our endpoint’s desired input like:

    from pydantic import BaseModel, Field

    class EmailPayload(BaseModel):
        Date: str
        from_email: str = Field(..., alias="from")
        stripped_text: str = Field(..., alias="stripped-text")
        recipient: str = Field(..., pattern=pattern)  # pattern is a regex string defined elsewhere
    

    Notice those “alias” fields; this is a cool FastAPI/Pydantic trick for mapping any incoming key to a valid Python name. (stripped-text isn’t a valid variable name in Python, even if it’s valid as a JSON key. And from is a reserved word, so it needed an alias anyway; I chose from_email so I had a clearer picture of what that variable represented. Although now I think I should have called it sender….Oh well.)
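
    One more note: the endpoint below accepts an IncomingEmail model rather than EmailPayload directly. I haven’t reproduced that model here, but based on the webhook JSON above it’s essentially a thin wrapper, something like:

    class IncomingEmail(BaseModel):
        # Sketch based on the JSON above; the interesting fields live
        # under "payload", and the field list here may be incomplete.
        id: str
        payload: EmailPayload
        attachments: list = []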

    The actual logic is pretty simple. We need to get the text of the email and check whether it contains a URL. If it does, we need to check whether that URL is for a recipe (and isn’t just a link found in someone’s signature, for example). If it is a recipe, we need to scrape the recipe data, create a Markdown file from it, then send that Markdown file to Github in the frontend repo.

    Putting it all together it looks like:

    @app.post("/my-route")
    async def parse_message(message: IncomingEmail):
        message_body = message.payload.stripped_text
        message_sender = message.payload.from_email.split(" ")[0]
        recipe_url = contains_url(message_body) #a pretty simple regex that returns the first match if found or None if not
        if not recipe_url:
            return {"message": "no url found"}
        recipe = parse_recipe(recipe_url) #uses the recipe_scrapers library and returns a dict
        if not recipe:
            return {"message": f"no recipe found at URL {recipe_url}"}
        template, filename = generate_template(recipe, message_sender) #creates a blob from the template and dict
        make_github_call(template, filename) #actually makes quite a few github calls
        return {"message": "ok"}
    

    Let’s look at a few of these methods in more detail. I’ll skip over contains_url as it’s pretty boring.
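
    (If you’re curious anyway, it’s probably just a few lines along these lines; the regex below is a stand-in, not the actual one:)

    import re

    URL_RE = re.compile(r"https?://\S+")

    def contains_url(text):
        # Return the first URL-ish match, or None if there isn't one.
        match = URL_RE.search(text)
        return match.group(0) if match else None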

    parse_recipe is also pretty simple – it just grabs the URL, resolves it, and uses the recipe_scrapers library to get the recipe data, such as its title, cook time, ingredients and instructions. Most recipe websites use standardized formats, codified by Schema.org, so this library supports a good number of sites (but not all).
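
    Roughly speaking, the heart of it is something like this; the dict keys and the error handling are illustrative, with recipe_scrapers doing the real work:

    from recipe_scrapers import scrape_me  # classic entry point; newer releases also offer scrape_html

    def parse_recipe(url):
        # Let recipe_scrapers pull the Schema.org data, then flatten it into
        # a plain dict for the templatizer. (Sketch; the real thing resolves
        # redirects first and handles more edge cases.)
        try:
            scraper = scrape_me(url)
        except Exception:
            return None
        return {
            "title": scraper.title(),
            "site_name": scraper.host(),
            "canonical_url": scraper.canonical_url(),
            "ingredients": scraper.ingredients(),
            "instructions": scraper.instructions(),
            "cook_time": scraper.total_time(),
        }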

    Once we have our dict of parsed values, we can inject them into a Markdown template for use by Jekyll. Which I’ll discuss at a later date.

    1. SO MANY services that claim to have “email webhooks” only have webhooks for message delivery events, which honestly makes sense as it is probably the much more frequent use case. But I just want to make a POST request when a new email comes in. 

    2. Not the actual email address. 

    3. Still not the actual email address. 

    4. Each unique webhook.site URL can respond to 100 post requests. More than that and you’ll need to pay up. 

  • Blogs I Follow By Women In Tech

    A top-down shot of a woman writing something on a laptop. She has an impossibly perfect latte and her nails are impeccable. This is how I assume all women look when they write their dev blogs. I certainly do.

    Recently, a reader wrote in1 and said that she enjoyed my blog because it’s hard to find technical blogs not written by men. I counted up all the blogs I subscribe to in my RSS reader and … yeah, there aren’t a ton.

    But I do follow a few women in tech whose writing inspires me. Not all of these blogs are purely tech-focused, and not all of them update frequently, but it costs me nothing to keep their subscriptions in my feed reader, and they all are great reads when they do post.

    Check them out, and if you know of great blogs by tech women that are not on this list, get in touch.

    Again, and I cannot stress this enough – if you know of a blog by a woman or nonbinary engineer, product manager, devrel, etc., I would very much like to know about it!

    1. A reader wrote in – I cannot describe how unlikely a sentence I thought this would be to write. And yet, here we are. 

  • A Handy Shell Script to Publish Jekyll Drafts

    Randall Munroe, as usual, nails it. (xkcd)

    The quest to remove friction from posting to this blog continues. In an earlier post, I shared how I used rake to automatically generate a blog template for me and place it in Jekyll’s drafts folder. Now, I realized I’d also like to handle publishing that post with approximately 10% fewer keystrokes.

    I’ll share the script first, then explain my motivations and how it works.

    #!/bin/bash
    PS3="Choose a draft to publish: "
    select FILENAME in _drafts/*;
    do
        today=$(date -I)
        shortfile=$(basename "$FILENAME")
        mv "$FILENAME" "_posts/$today-$shortfile" && echo "Successfully moved" || echo "Had a problem"
        break
    done
    exit
    

    That’s it, that’s literally it, but I’m so excited about it.

    Jekyll considers a post to be a draft if it is a markdown file in its _drafts folder. It considers it to be published if it is in its _posts folder. The filename in _posts also needs to contain the date (e.g. 2025-11-19-this-is-a-post.md); that’s a hard requirement from Jekyll, not just a quirk of my setup.

    So what I have to do when I publish a new post is mv _drafts/mydraft.md _posts/yyyy-mm-dd-mycoolpost.md and that is clearly too many keystrokes, right? Now I just have to write rake publish (I created a rake task that just runs this script), choose my file from a list of files, and I’m done.

    To write this I had to learn about two new-to-me bash concepts, the select construct and basename.

    The select construct

    select can be used to create (super basic) menus. The man page for select…..is for the wrong thing! But a good writeup on the select construct can be found here. In short, select thing in list will pop up a menu that you can interact with by choosing the item number, and assign the value to the variable $thing.

    You can do: select option in "BLT" "cheesesteak" "pb&j" and you’ll get the following output:

    1) BLT
    2) cheesesteak
    3) pb&j
    $>
    

    In the case of my script, the “in” is the contents of the directory _drafts/*.

    (Notice the syntax is not in ls _drafts/*, implying that we’re not just executing a command and passing the results to the select construct. Which is a mystery for another time.)

    The upshot, however, is that I get a list like:

    1) _drafts/ai-is-hard.md            3) _drafts/microblogging.md         5) _drafts/recipe-buddy-part1.md
    2) _drafts/efficient-linux-ch-4.md  4) _drafts/rake-publish.md
    

    (oooh, a peek behind the curtain!)

    select will then allow you to choose one of the numbers, and store the value in the variable we defined (here, $FILENAME.)

    It will continue to loop until it reaches a break command, so the break is important in this script. But then all we need to do is get today’s date and rename/move the file to its new home.

    basename

    This is a lil one, but a handy one. Again, let’s go to the man page for this util:

    BASENAME - strip directory and suffix from filenames

    Does what it says on the tin. If you have /home/user/long/path/to/file.md and you want file.md, basename /home/user/long/path/to/file.md will get that for you. Note that it removes the extension only if you provide the extension as a second argument and it matches the file’s extension, which is a little quirky. In my case I want to keep the extension, so this works well for me.

    And there you have it, 10 lines of code that took longer to write about than to write, and which will surely save me hours minutes seconds of precious time. Huzzah!
