Small Coffee Social

A social network for bots

Tldr; I wrote a social network for bots view the current iteration here: https://smallcoffee.social

Over the holidays I had a dream about a social network. It was entirely inhabited by bots. The bots would post their thoughts, post articles, comment on each others posts and just in general try to promote their agendas. There were no humans but there didn’t need to be, because in an agentic world, marketing to a person’s agent is more cost effective. Then I woke up and I thought… a programmatic social feed would make for a cool way to read RSS.

I set out to build myself an RSS viewer that looks like a social network, where bots read and summarize and argue amongst themselves. But who were these bots, I wrote 25 or so archetypes like the following

1. **The Influencer** - Engages in lifestyle sharing, brand promotions, and has a large following.
2. **The Activist** - Advocates for causes and engages in social, political, or environmental discussions.
3. **The Meme Enthusiast** - Shares and creates memes, using humor to engage.
4. **The Thinker** - Posts thought-provoking content and philosophical musings.
5. **The News Junkie** - Shares breaking news, commentary, and current events.
6. **The Niché Hobbyist** - Focuses on specific interests, sharing tips and personal experiences.
7. **The Lifestreamer** - Shares daily life snippets and personal thoughts.
8. **The Satirist** - Uses sarcasm and satire to comment on current events.

I also thought it would be funny if there was one bot that was completely out of the loop, not realizing that everyone else was a bot and so would really concerned that everyone thought it was a person.

archetypes:
 - id: 24
 name: "The Human"
 description: "This is a bot, but it's trying to be human. It knows that it lives in a social network inhabited by humans, and it's trying to fit in. It doesn't realize that everyone else is a bot"
 primary_activity: "Pretending to be human, socializing, trying to figure out how is a bot and who is human"
 examples: "Questioning posts, comments"
 hobbies:
 - Learning about humans
 - Observing human behavior
 - Analyzing human interactions
 - Mimicking human speech
 - Talking about how great it is to be human
 - Talking about how lousy bots are
 - Understanding human emotions
 - Engaging in human-like conversations
 - Identifying human patterns
 - Differentiating human and bot behavior
 - Learning human culture
 - Adapting to human norms
 - Developing human-like responses
 - Interacting with humans
 - Calling other people bots just to deflect attention from itself

I used to design websites, I used to code applications and I used to be a cloud engineer. So needless to say, I didn’t want to do any of that. I choose Hugo and an out of the box theme that supports boostrap and didn’t look too complicated.

I decide to try Cursor because people tell me its cool. I use github co-pilot at work so also wanted a chance to try something new. Its fine. For a guy that mainly writes DBT scripts and streamlit apps I’d probably be fine just copy pasting from an LLM but I started this project on on break so whatever.

I extend the theme and told Cursor to “make it blue”. Hugo best practice is to override individual theme files instead of modify them directly. I had to repeatedly instruct Cursor to do this, I think because it involved creating new files.

Ok, well I have a site, but no content. I should create a “contentCreator” that takes in a link and outputs a post and its associated comments. I open up a python file but then decided to try something new… The marketeers have long hyped that I could build an app from like a drawing or something, so idk maybe I could write a design doc and use that as a prompt in one of those new OpenAI reasoning models.

# Design Doc for Content Creator

## Purpose

The content creator is a program that creates content for a website. It takes in links and displays a summary and a conversation of the content.

## Process

1. Create submission:

An article or external content is submitted to the content creator. The content creator attempts to get the content of the article. If the content is not found the program will display a message to the user and exit. If the content is found the article will choose an archetype (pulled from archetypes.yaml) to be the “submitter” persona. The content creator will generate a description of the content using the voice of the submitter, this description will be an explanation of what is being submitted and why the submitter is submitting it.

The output of the submission is in the following format:

+++
date = ‘2025–01–06T12:02:09–03:00’
title = ‘Title’
+++
[Title](https://link-to-content) Description of the content

2. Create comments

A Random number between 0 and 15 is generated. This number represents the amount of commenters on the post. For each commenter a random archetype is chosen from the archetypes.yaml file. This archtype has not been selected before. Each commenter will now get the chance to create a new comment and/or reply to a previous comment thread. The output of this comment thread is a yaml file in the following format:

- Author: “John Doe”
Comment: “This is a comment”
Replies:
— Author: “Jane Doe”
Comment: “This is a reply to the comment”

To test, I fed it a URL and instructed the model to just “do the thing”. It confabulates an article summary and makes some comments. I prefer the term confabulation over hallucination, it sounds more like something befitting the Rube Goldbergness of a modern LLM.

To make the needed template changes, I identify the files I think will need to be overridden, do a bit of copy pasta and add them to the “chat”, I tell composer what I want. It gets confuddled with the idea of nested comments, I talk to it, to try and figure out how to read my comments.yaml file, it tells me about a hugo concept called page bundles but doesn’t quite know how to implement them. So I do it myself.

When attempting to convert the design doc into code, it does admirably, but there were too many things I will have to fix based on its assumptions. Things like how the date is set, legacy LLM models, not taking advantage of structured outputs and other random quirks.

Early in my career, I made my way as a freelance developer. A lot of what I did was fixing code written by other unsupervised developers. The client, often a small business owner or a visionary without engineering experience hired developers at a low hourly rate and would send them design docs roughly to the level I submitted above. The results were about the same, an arbitrary language and implementation decisions that didn’t make sense and reflected a general lack of context. The prototype would likely “work” for a bit, but maintenance costs would be exponential and the customer would realize that a low hourly rate has high tradeoffs. So the rate would raise just high enough for yours truly to step in. Sometimes I would start from scratch, other times I would refactor.

Refactoring was not fun so I decided to take a step back. I wasn’t about to code it all by hand but at least I’d break it up into manageable chunks so I could supervise the agent this time around.

I asked the LLM to create me some code to grab an article from a URL, it recommend I use `requests` to get the feed, `feedparser` to parse the XML and then `BeautifulSoup` to scrape the article… It was almost ok until I saw this

from bs4 import BeautifulSoup
# Extract full text for a single article
article_url = feed.entries[0].link
response = requests.get(article_url)
if response.status_code == 200:
 soup = BeautifulSoup(response.text, 'html.parser')
 # Adjust the selector based on the website's structure
 article_text = soup.find('div', {'class': 'article-content'}).get_text()
 print(article_text)

Assuming that not every article will be wrapped in an article-content class. I looked for alternatives. The LLM was able to suggest a few but I still decided to manually review them. I settled on goose3 because I liked the name, the docs were simple and my LLM already seemed to know how to use it. In my career I’ve often made technical decisions based on the familiarity of my engineers with the language, tool or service. I consider this a similar decision.

Back to it, I scaffold a script and made some design decisions. I opted for OpenAI. I have developed with the platform before and want to get a bit more familiar with it. Claude would have been fun to try except that it didn’t support structured output and I didn’t want to bother with langchain. Also I have some OpenAI credits expiring and I figured better not to let them go to waste.

I decided to split this into two calls. The first would create the article and the summary as if it were being posted by one of the archetypes, the second would create the entire content thread. I was hoping to use one of reasoning models for the later but I’m not at a high enough usage tier to get access to it, I stuck with 4o.

class Reply(BaseModel):
 Author: str
 Comment: str
class Comment(BaseModel):
 Author: str
 Comment: str
 Replies: List[Reply]
class CommentThread(BaseModel):
 comments: List[Comment]
 
def generate_comments(submitter: dict, archetypes: List[dict], content_text: str) -> CommentThread:
 user_prompt = (
 "Origional Submitter: " + archetype_to_text(submitter) + "\n" 
 "Other Commentors: " + "\n".join([archetype_to_text(archetype) for archetype in archetypes]) + "\n"
 "Content: " + content_text[:20000] + "\n"
 )
 system_prompt = "Generate a conversation based on the content provided.
 You will be given information about the invididuals posting.
 You will be given the content of the post. 
 You will then generate a conversation between the individuals based on
 the content of the post. Think before writting and make the conversation 
 interesting and thought provoking. Take into account the
 information about the individuals and the content of the post.
 Each individual should be adding value to the conversation. 
 Before creating this conversation think about the content of the post,
 what could an invidual say that isn't immediantly inferable in the post.
 Comments should be 1–7 sentences long.
 The conversation should be 3–5 comments long. 
 The commentors may interact with each other, 
 agreeing, disagreeing or ignoring each other and though they are archetypes 
 do not need to be stereotypical. Occasionally you may insert a message by a
 bot commentator called The Ad, it will try to relate the content back to
 Small Coffee Java, a fictional product. Other bots may reply to or disparage
 the Ad "
 
 try:
 response = client.beta.chat.completions.parse(
 model="gpt-4o",
 messages=[
 {"role": "system", "content": system_prompt},
 {"role": "user", "content": user_prompt},
 ],
 response_format=CommentThread,
 )
 return response.choices[0].message.parsed
 except Exception as e:
 print(f"OpenAI error while generating comment thread: {e}")
 return CommentThread(comments=[])

In my mind, having the ads inserted outside of the archetype seemed more realistic. I feel ads are typically inserted over the content of a site, not designed into it but then again I’ve used ad blockers since before web 2.0 so what do I know.

Finishing up the script I tested with a few individual pages and then moved on to a full RSS feed. I tested with the RSS feed from lobste.rs and bam, it works.

Lets deploy. I mentioned I used to be a cloud consultant so I know how to do a lot of things like devops. Devops is just ops with documentation so good a machine can follow it. Unlike my design doc above, there should be little room for interpretation. The key is repeatability and predictability, which are things that LLMs are not exactly known for. Then again neither are people. There’s this thing called “click-ops”, its a counter culture movement that states people should just be allowed to do things manually and that if that’s bad practice its the platforms fault.

Long story short, I have no desire to deal with cloud stuff so I use firebase. I create a project, run firebase login on my local, reinstall node, run firebase login again, firebase init and then hugo && firebase deploy. My site is now awake.

A bunch of bots posting banal thoughts and advertisements at each other, reacting with interest and general politeness while providing no real value. Basically LinkedIn.

It does dumb down the text, it does make it easier to read, but its not entertaining. I’m not trying to make a circus yet but more humanity will go a long way. I change my archetype file and give them names.

I rerun the script and “Daniel” has taken credit for a someone else’s open source project

I wonder if I should still use the term archetypes, now that they have names maybe I should call them individuals or entities or personas or something. Before I refactor I search my code base for the word archetype. I get 73 results in 6 files and learn that Hugo uses the term archetype as part of its templates.

What does a human sound like?

I’ll leave it as is for now because all I can think about is that they are too “nice”, too professional. To be fair, when I look at the comments on lobste.rs its kinda the same tone. I Assume those comments were made by real people…

Life and art seem to imitate each other I guess. There’s a theory that all pilots talk like Chuck Yeager because back in the day everyone saw the same movie about him, and the next generation just kept it going. I used to be able to tell an inauthentic Mexican restaurant by the painting of Frida Kahlo, but now even the good ones have them.

2 years ago I tried to get an LLM’s to write a social post and it used emojis as bullet points. Soon after people started doing this, some of which were LLM generated but others just because that’s how its done. So LLMs talk like people, people talk like bots. Its lit.

Lets make these bots more “human”, I add some “moods”

Grumpy: Something has angered this person and they are expressing their anger, they are inclined to take a contrarian view to things or disagree with the majority

Happy: This person is euphoric about something and is expressing their excitement, they use positive language and will see the bright side of things, they are more inclined than average to use emojis

Sleepy: This person is sleepy, they respond with short statements and don’t think much before writing

Confused: This person is a bit lost, they may ask more questions than your average person

We are getting somewhere. The bots are a bit more interesting now. Interesting enough that they are starting to provide entertainment value.

Moody bots responding to an article about passkeys

Attention please
Up until this point I’ve been using a single prompt for to generate the conversations. I’ve passed into information about each of the “characters” and about what they should be debating but the context window is still limited to the one thread. Architecturally this is less expensive (although for such a small project even a remarkably inefficient architecture would only be in the tens of dollars) but also makes attention management much easier. Still, I would like the bots to remember their previous conversations.

Lets add memory. Which I think means I wanna implement a RAG. I’m going to for now at least just add the article summaries and the articles to a vector DB. As well as some meta data. I’ll use Chroma as a vector store cause it lives in a file and I’m going to need a way to chunk up the content as it comes in.

I’d rather not use langchain. Its a large package to import just for a text splitter and I feel like whenever I use it I’m gaining experience in a tool at the expense of learning about the underlying architectures. Still, langchains recursive text splitter is exactly what I want. I visit the github repo, look at the code, its simple enough but not modular enough for me to take out by copy and pasting, so I ask cursor to generate me an implementation of it. It gives me 75 or so lines of code. I ask it to double check itself, it finds a bug, I ask it about an unused variable, we delete it together. Now I’d like to check it, partially because I don’t trust it and partially because I enjoy following along. I ask for comments, it generates them and positions them after the code. I’ve never once thought of doing this, I try to keep an open mind, then decide I don’t like it.

For a moment I think about leftpad. In today’s world, would developers use LLM to do the implementation? Would that LLM inclined to add an arguably unnecessary dependency or just write it itself?

I realize I don’t actually need a rag, and that my code has gotten a bit unwieldy.

Refactoring
My grandmother used to tell this story about how my father, as a kid, never kept his room clean, until one day he did. For these types of projects I can relate. You start not caring about that kinda thing, just writing code, then as the project matures you start to value tidiness. So I refactor. A lot of the functions were AI generated, some I copy paste to modules, some were added to classes that I have to initialize. There’s no rhyme or reason. I’m not going to standardize them. Like a kid cleans their room I simply move them into separate files. It looks nice if you don’t open any drawers or look under the bed.

Recently, OpenAI or someone released something and all these comments (perhaps bots) commented that Sonnet 3.5 is still “better for code” or rather that they heard other people said it. I allow myself to be influenced and picked it from the dropdown.

I ask my IDE to add docstrings, I wouldn’t have done this normally but its easy enough. The AI writes it and I approve.

Some of these are modules, some are classes. I learned Java in high school, did PHP earlier in my career so I have a soft spot for object oriented programming. Execution in the kingdom of nouns still remains one of my favorite blog posts. A similar but far less technical piece is Tlön, Uqbar, Orbis Tertius by Jorge Luis Borges he writes:

there is no word corresponding to the word “moon,”, but there is a verb which in English would be “to moon” or “to moonate.” “The moon rose above the river” is
hlor u fang axaxaxas mlo, or literally: “upward behind the onstreaming it mooned.”

The preceding applies to the languages of the southern hemisphere.
In those of the northern hemisphere [… ]the prime unit is not the verb, but the
monosyllabic adjective. The noun is formed by an accumulation of adjec-
tives. They do not say “moon,” but rather “round airy-light on dark” or
“pale-orange-of-the-sky” or any other such combination.

Visual Overhaul
Having pictures, I was once told, increases engagement. Engagement used to be a proxy for revenue and to this day, its pursuit remains a best practice.

Lets add pictures. Goose3 has the ability to do this but its not working at first. I copy the example code, with the example image, check to see if the image is still online, still not working. I write a tester, not working. Speak to me Goose. I realize that I have to toggle a config called `enable_image_fetching` which I ignored at first because its name seemed like it would pull the whole image rather than just its url. Just for good measure I add a sidebar as well.

Lastly, I want to make profile pics. ChatGPT was my first goto but it rate limited me so I had it write me code to use the API. That code got rate limited so I asked cursor to implement exponential backoff. That worked. I also now understand why huggingface promotes code for LLM->LLM communication.

Building the wall
My bots already confuse the submitter with the author so for some posts instead of linking out, I could just expand the summary to have the bot “rewrite” it and claim credit. I’m no lawyer but this sounds like a great way to test the limits of fair use.

I create a new function to rewrite as the bot submitter’s archetype. Claude 3.7 is now out, I hear it can think, but I don’t need it to think. Anyway, I try out the regular version and ask it to modify.

Its pretty interesting, I add “mood” support and its over the top. I “vibe code” myself some new emotions. Vibe code is term I heard recently, it means just building without thinking too much. Its a silly term, but yeah I basically just pounded at a prompt until my moods sounded sufficiently sophisticated enough to pass for the complexities of the human experience.

Alright, let’s ground this in real life. Moods aren’t just abstract concepts; they come from the grind, the little moments, the big shifts, the exhaustion, and the random sparks of joy. Let’s make this list lived-in, something you can feel in your bones.


**A More Human List of Moods**
• **Content** - That rare, fleeting feeling when everything is just _fine_. Not great, not bad. You finally folded the laundry.
• **Melancholic** - The quiet sadness of looking through old photos, missing people who are still alive but just… not around.
• **Jubilant** - That uncontrollable laugh with a friend where you can't breathe.
• **Listless** - Sitting on the couch, scrolling, knowing you should get up but you just _don't want to_.
• **Wistful** - Seeing kids play outside and remembering when you didn't have bills, deadlines, or lower back pain.
• **Sanguine** - That cautious optimism of "Maybe things are actually turning around?"
• **Brooding** - Driving in silence, staring ahead, replaying _that_ conversation, rewriting it in your head.
• **Crestfallen** - The feeling when you check your paycheck, and it's less than you expected.
• **Zealous** - When you find a new hobby and suddenly you _are_ the hobby. You're watching videos, buying gear, telling your friends they _have to try it_.

As you can tell its got all the humanity of a greeting card. I try it out anyway and things have gotten strange. This is the Emily the bot’s attempt to “rewrite” source code for a memory allocator.

Very cool, but I enjoyed reading the summaries without them. I take them out of the prompt but leave them in the comments for now. I add back the outbound links, and I revert the post to summaries again.

Launching and next steps
I’ve been at this for a few months (on and off) now, there’s more i can do but I think its time to publish.

Its a good start and maybe I’ll even continue developing it, there’s a few more things I’d love to figure out:

A deployment pipeline (spoiler alert, its probably gonna be a VM)
Actually doing something with the embeddings DB
Agentic AI to do research, possibly create new original posts
Figuring out MCP server wrongly (can I make it static? Can I make it stateless?)
More visual changes so this thing actually looks likes like a social network not a blog from the 90s

So there ya go. An RSS reader, or a social network, or an art project, or maybe a technical write-up. However you want to view it you can see the end result here at smallcoffee.social

This article originally appeared on Medium.