Case Study · 6 min read

Building in Public: 30 Days of Autonomous SMM -- Metrics and Lessons

Dmitrii Malakhov

I let AI agents run my social media for 30 days. No manual posts. No hand-crafted replies. Just the system, doing its thing, while I watched the numbers.

Here's everything that happened. The good, the embarrassing, and the stuff I'd do completely differently.

Week 1: Setup and chaos

Days 1-3 were mostly configuration. Getting the voice fingerprint right. Tuning the query rotation. Setting rate limits that wouldn't get my accounts flagged. Making sure the browser automation didn't crash every other session.

By day 4, the system was running 90 sessions a day across X and LinkedIn. 75 replies on X, comments on LinkedIn, a couple original posts per day.
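Those early knobs can be sketched as a config. Every name and value below is illustrative (the post only mentions session counts, reply volume, and rate limits), not the actual system's schema:

```python
# Illustrative sketch of the week-1 configuration knobs described above.
# All names here are assumptions for illustration, not the real config schema.

DAILY_PLAN = {
    "sessions_per_day": 90,       # total browser sessions across X + LinkedIn
    "x_replies_per_day": 75,      # replies on X
    "original_posts_per_day": 2,  # "a couple original posts per day"
    "min_gap_seconds": 120,       # spacing to stay under platform rate limits
}

def sessions_per_hour(plan: dict, active_hours: int = 16) -> float:
    """Spread the daily session budget over an assumed active window."""
    return plan["sessions_per_day"] / active_hours

print(sessions_per_hour(DAILY_PLAN))  # 5.625
```

At these settings that's roughly 5-6 sessions per active hour, which is why spacing between actions matters as much as the daily total.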

The first posts were rough. Not bad exactly, but clearly "AI that's trying hard." Too clean. Too structured. Every paragraph the same length. I spent day 5 adjusting the voice fingerprint, feeding it more examples of my real writing, emphasizing the rough edges and tangents that make human writing human.

Day 7 milestone: the system completed a full week without crashing. That sounds like a low bar but if you've ever run browser automation at scale, you know that "didn't crash for 7 days" is a genuine achievement.

Week 2: First signs of life

Something shifted around day 10. The reply quality improved noticeably. Not because I changed anything, but because the memory system started kicking in. Working memory accumulated enough context about what topics get responses, what tone works in which conversations, which types of accounts engage back.

Day 11: first warm lead. A founder building a productivity tool messaged asking about the system after seeing three of our replies in different conversations over the previous week. Three touchpoints, none of them pushy, just useful comments in threads about marketing challenges.

Day 14 numbers:

  • 1,050 replies sent
  • 42,000 impressions on X
  • 87 new followers
  • 1 warm lead
  • 0 negative feedback (nobody called us out as a bot)

The zero negative feedback surprised me. I'd expected at least a few "this is obviously AI" callouts. Either the voice fingerprint was working or people just don't care as much as Twitter discourse would have you believe.

Week 3: The scoring pivot

Week 3 is when I realized impressions are a vanity metric for what we're doing.

42,000 impressions in two weeks sounds great until you look at the conversion path. Impression to profile visit: about 0.85%. Profile visit to any meaningful action: maybe 3%. So out of 42,000 impressions, roughly 10 people actually did something. And of those 10, one became a lead.

That math is bad.
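The funnel arithmetic above, worked end to end:

```python
# Working the impression-to-action funnel from the paragraph above.
impressions = 42_000
visit_rate = 0.0085    # impression -> profile visit (~0.85%)
action_rate = 0.03     # profile visit -> any meaningful action (~3%)

profile_visits = impressions * visit_rate           # ~357
meaningful_actions = profile_visits * action_rate   # ~10.7

print(int(profile_visits), round(meaningful_actions, 1))  # 357 10.7
```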

So I rebuilt the topic scoring system. Instead of optimizing for engagement (likes, replies, impressions), I weighted everything toward DM conversion. The new formula:

  • DM conversion rate: 30% weight
  • Reply rate: 25%
  • Recency: 20%
  • Topic diversity: 15%
  • Post performance: 10%
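The reweighted score is just a weighted sum over those five signals. The weights below are the ones listed; the metric names and the assumption that each signal is pre-normalized to 0..1 are mine:

```python
# Minimal sketch of the reweighted topic score. The weights are from the
# list above; metric names and 0..1 normalization are assumptions.

WEIGHTS = {
    "dm_conversion": 0.30,
    "reply_rate": 0.25,
    "recency": 0.20,
    "topic_diversity": 0.15,
    "post_performance": 0.10,
}

def topic_score(metrics: dict) -> float:
    """Weighted sum; each metric is assumed pre-normalized to 0..1."""
    return sum(WEIGHTS[k] * metrics.get(k, 0.0) for k in WEIGHTS)

# A high-likes, zero-leads topic vs. a real-problem ICP topic:
vanity = topic_score({"post_performance": 0.9, "reply_rate": 0.4})       # 0.19
icp = topic_score({"dm_conversion": 0.6, "reply_rate": 0.5, "recency": 0.7})  # 0.445
print(vanity < icp)  # True: the ICP topic wins despite fewer likes
```

This is the whole mechanism behind "topics that got lots of likes but zero leads got demoted": post performance alone can contribute at most 0.10 to the score.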

Topics that got lots of likes but zero leads got demoted. Topics where people in our ICP (solo founders, agency owners) were having real conversations about real problems got promoted.

The effect was immediate. Fewer total impressions, but more of the right conversations.

Day 18: second warm lead. An agency owner managing social media for 8 clients, exhausted, looking for exactly the kind of system we built.

Week 4: Conversion experiments

With the scoring system tuned, I started experimenting with CTAs. The system was having good conversations but not always directing people toward next steps.

Added subtle CTAs to the engagement prompts. Not "check out our product" in every reply (that's spam). More like: when someone describes a problem that our system solves, the reply includes a mention that we've built something for this, and a link to book a call if they're curious.
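The gating logic reduces to one condition: attach a soft CTA only when the other person has described a problem the product addresses. The keyword matching below is a hypothetical stand-in for whatever classification the real engagement prompt does:

```python
# Hypothetical sketch of CTA gating: only attach a soft CTA when the message
# describes a relevant pain point. Keyword matching stands in for the real
# prompt-based classification; the signal list is invented for illustration.

PAIN_SIGNALS = ("can't keep up", "burned out", "posting every day", "too many clients")

def should_attach_cta(message: str) -> bool:
    text = message.lower()
    return any(signal in text for signal in PAIN_SIGNALS)

print(should_attach_cta("Honestly I can't keep up with posting for 8 clients"))  # True
print(should_attach_cta("Great thread, agreed!"))  # False
```

The point is the default: no CTA. A pitch in every reply is spam; a pitch only when someone has named the pain is a useful pointer.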

Day 22: two more warm leads in a single day. Both from LinkedIn, both from conversations about scaling content for multiple clients.

Day 25: I also started tracking which conversations led to profile visits (X analytics gives you this). The correlation between reply depth (our reply getting a reply back) and profile visits was strong. Shallow "nice post" comments drove zero visits. Substantive responses that added something to the conversation drove 3-5x more profile clicks.

The final numbers

After 30 days:

  • Total sessions: ~2,700
  • Replies/comments sent: ~2,250
  • Original posts published: ~60
  • X impressions: 90,000
  • New X followers: 111
  • X profile visits: 764
  • Warm leads (total): 6
  • Leads from X: 3
  • Leads from LinkedIn: 3
  • Negative feedback / bot callouts: 0
  • System downtime: ~4 hours (one crash, one config bug)
  • Total compute cost: ~$240

The uncomfortable truth

90,000 impressions. 6 leads. That's a 0.0067% conversion rate from impression to lead.

Even if you measure from profile visit to lead (764 visits, 6 leads), that's 0.79%. For comparison, a decent landing page converts at 2-5%.
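Both rates, computed from the table above:

```python
# The conversion math above, worked from the final numbers.
impressions, profile_visits, leads = 90_000, 764, 6

impression_to_lead = leads / impressions     # ~0.0067%
visit_to_lead = leads / profile_visits       # ~0.79%

print(f"{impression_to_lead:.4%}  {visit_to_lead:.2%}")  # 0.0067%  0.79%
```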

The honest assessment: autonomous social media engagement is not a lead generation machine. It's a presence machine. It puts you in conversations. It builds familiarity. It creates the conditions where, when someone does have the problem you solve, you're already someone they've seen around.

The leads that came through weren't cold conversions. They were people who'd seen our replies multiple times over days or weeks. The system built familiarity first, and the conversion happened later when the timing was right.

What I'd do differently

Start with tighter ICP targeting from day one. I wasted the first two weeks engaging broadly in "tech twitter" conversations. Should have focused immediately on the specific personas (solo founders, small agency owners) who actually need what we're building.

Build the DM conversion scoring first. The pivot in week 3 should have been the default from the start. Engagement metrics are noise for a B2B product. The only metric that matters is "did this lead to a real conversation with a potential customer?"

Post less, engage more. Original posts are great for visibility but they're not where leads come from. Replies in other people's conversations are where the real connections happen. I'd shift the ratio further toward engagement.

Set up the memory system earlier. The 3-layer memory (debriefs, working memory, long-term learnings) makes a massive difference, but it needs about 2 weeks of data before it's useful. Starting it from day one means it's ready when you need it.
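The shape of that 3-layer memory can be sketched in a few lines. The class and method names are assumptions; the real system presumably distills working memory with an LLM pass rather than the placeholder below:

```python
# Illustrative sketch of the 3-layer memory described above: per-session
# debriefs feed a rolling working memory, which is periodically distilled
# into long-term learnings. Names and the distill step are assumptions.

from dataclasses import dataclass, field

@dataclass
class Memory:
    debriefs: list[str] = field(default_factory=list)   # per-session notes
    working: list[str] = field(default_factory=list)    # rolling recent context
    long_term: list[str] = field(default_factory=list)  # distilled learnings

    def debrief(self, note: str, keep_last: int = 50) -> None:
        self.debriefs.append(note)
        self.working = self.debriefs[-keep_last:]  # rolling window

    def distill(self) -> None:
        # Stand-in for an LLM summarization pass over working memory.
        if self.working:
            self.long_term.append(f"learned from {len(self.working)} sessions")

mem = Memory()
for day in range(14):  # ~2 weeks of sessions before it's useful
    mem.debrief(f"session notes day {day}")
mem.distill()
print(len(mem.working), mem.long_term[-1])
```

The ~2-week lag falls straight out of the structure: working memory is only as good as the debriefs accumulated in it, so starting it on day one means it's warm exactly when you start making targeting decisions.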

Month 2 starts now

The system is running. The scoring is tuned. The voice sounds right. Six leads in 30 days from $240 in compute costs. Not great, not terrible.

But it's compounding. Every day the memory gets richer, the targeting gets sharper, and the voice gets more natural. Month 2 should be better than month 1. That's the whole thesis.


If you're a solo founder or agency owner tired of the posting treadmill, book a 30-minute demo and see the system running live. Or get the playbook -- free PDF on how we run SMM for $10/day.