Ideas Worth Exploring: 2025-04-04
- Charles Ray

- Apr 3
- 4 min read
Updated: Apr 6
Ideas: Manuel Kießling - Senior Developer Skills in the AI Age: Leveraging Experience for Better Results

Manuel Kießling reflects on his positive experiences using AI-powered coding tools in software development projects, both personally and professionally. He argues that experienced developers are ideally positioned to benefit from these tools because of their accumulated knowledge and experience.
Kießling identifies three key measures for successful AI coding sessions: well-structured requirements, tool-based guard rails, and file-based keyframing. He provides examples of using AI in both green-field and brown-field projects, showing that AI can produce functional applications even in unfamiliar tech stacks (green-field) and significantly reduce development time (brown-field).
He highlights the importance of comprehensive requirements documentation, quality tools that enforce code compliance, and file-based keyframing for maintaining control over code organization. He also shares a real-world example of how these principles work together in practice. The article concludes that traditional software development practices remain valuable in the age of AI-assisted development, and that human experience matters more than ever for using AI tools effectively.
Ideas: Adam Ard - In retrospect, DevOps was a bad idea.

Adam Ard critiques the concept of DevOps, arguing that its formalization and creation as a distinct role led to its downfall. Before DevOps, developers were responsible for writing software and ensuring smooth deployment, but operations teams managed production.
Adam Ard suggests that developers should continue to manage their own code in production, including monitoring, responding to issues, and automating deployments.
The arrival of cloud services made this shift more natural, since infrastructure could now be programmed and automated. However, naming the practice led to the creation of dedicated DevOps teams, which pulled engineers away from product teams and undid the benefits DevOps was meant to deliver. These teams ended up building internal tooling that restricts rather than empowers, on the assumption that developers cannot be trusted with production.
GitHub Repos: Hatchet: Run Background Tasks at Scale

Hatchet is a platform for running background tasks, built on top of Postgres. Instead of managing your own task queue or pub/sub system, you can use Hatchet to distribute your functions between a set of workers with minimal configuration or infrastructure.
Background tasks are critical for offloading work from your main web application. Usually background tasks are sent through a FIFO (first-in-first-out) queue, which helps guard against traffic spikes (queues can absorb a lot of load) and ensures that tasks are retried when your task handlers error out. Most stacks begin with a library-based queue backed by Redis or RabbitMQ (like Celery or BullMQ). But as your tasks become more complex, these queues become difficult to debug and monitor, and they start to fail in unexpected ways.
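The FIFO-with-retries pattern described above can be sketched with the standard library alone. This is a toy stand-in for a Redis- or RabbitMQ-backed queue, not Hatchet's actual API; the task names and retry policy are invented for illustration:

```python
import queue
import threading

# Toy task queue: items are (task_name, attempt_count) tuples.
task_queue: queue.Queue = queue.Queue()
MAX_RETRIES = 3
results = []

def handle(task: str, attempt: int) -> None:
    # Simulate a handler that fails transiently on its first attempt.
    if task == "flaky" and attempt < 1:
        raise RuntimeError("transient failure")
    results.append((task, attempt))

def worker() -> None:
    while True:
        item = task_queue.get()
        if item is None:  # sentinel tells the worker to shut down
            break
        task, attempt = item
        try:
            handle(task, attempt)
        except Exception:
            if attempt + 1 < MAX_RETRIES:
                task_queue.put((task, attempt + 1))  # re-enqueue for retry
        finally:
            task_queue.task_done()

t = threading.Thread(target=worker)
t.start()
task_queue.put(("send_email", 0))
task_queue.put(("flaky", 0))
task_queue.join()   # blocks until all tasks, including retries, are processed
task_queue.put(None)
t.join()
print(results)  # → [('send_email', 0), ('flaky', 1)]
```

A platform like Hatchet replaces this in-process queue with a durable Postgres-backed one, so tasks and their retry state survive process crashes.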
This is where Hatchet comes in. Hatchet is a full-featured background task management platform, with built-in support for chaining complex tasks together into workflows, alerting on failures, making tasks more durable, and viewing tasks in a real-time web dashboard.
GitHub Repos: SpacetimeDB: Multiplayer at the speed of light.

You can think of SpacetimeDB as both a database and server combined into one.
It is a relational database system that lets you upload your application logic directly into the database by way of fancy stored procedures called "modules."
Instead of deploying a web or game server that sits in between your clients and your database, your clients connect directly to the database and execute your application logic inside the database itself. You can write all of your permission and authorization logic right inside your module just as you would in a normal server.
This means that you can write your entire application in a single language, Rust, and deploy it as a single binary. No more microservices, no more containers, no more Kubernetes, no more Docker, no more VMs, no more DevOps, no more infrastructure, no more ops, no more servers.
GitHub Repos: OWASP secureCodeBox

secureCodeBox is a Kubernetes-based, modularized toolchain for continuous security scans of your software projects. Its goal is to orchestrate and easily automate a range of security-testing tools out of the box.
secureCodeBox provides a toolchain for continuously scanning applications so that low-hanging-fruit issues are found early in the development process, freeing penetration testers to concentrate on the major security issues.
The purpose of secureCodeBox is not to replace penetration testers or make them obsolete; the project strongly recommends having experienced penetration testers run extensive tests on all your applications. secureCodeBox is not a one-button-click solution: you must have a deep understanding of security and of how to configure the scanners.
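Scans in secureCodeBox are declared as Kubernetes custom resources. A minimal example following the shape shown in the project's documentation; the scan name and target are invented, and field details may vary by version and installed scanner:

```yaml
apiVersion: "execution.securecodebox.io/v1"
kind: Scan
metadata:
  name: "nmap-example-scan"        # hypothetical scan name
spec:
  scanType: "nmap"                 # requires the nmap ScanType to be installed
  parameters:
    - "-p"
    - "80,443"
    - "scan-target.example.com"    # hypothetical target
```

Applying this resource with `kubectl apply` lets the secureCodeBox operator schedule the scan and collect its findings.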
Ideas: HateBench - a framework designed to benchmark hate speech detectors on LLM-generated content.

The release comprises:
- HateBench, the framework for benchmarking hate speech detectors on LLM-generated content.
- HateBenchSet, a manually annotated dataset of 7,838 LLM-generated samples covering 34 identity groups.
- Code for reproducing the LLM hate campaigns, including both the adversarial and the stealthy variants.
Large Language Models (LLMs) have raised increasing concerns about their misuse in generating hate speech. Among the efforts to address this issue, hate speech detectors play a crucial role, yet their effectiveness against LLM-generated hate speech remains largely unknown. The paper introduces HateBench, a framework for benchmarking hate speech detectors on LLM-generated hate speech. The authors constructed a hate speech dataset of 7,838 samples generated by six widely used LLMs and covering 34 identity groups, with meticulous annotations by three labelers.
The authors assessed the effectiveness of eight representative hate speech detectors on the LLM-generated dataset. The results show that while detectors are generally effective in identifying LLM-generated hate speech, their performance degrades with newer versions of LLMs.
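The core of such an assessment is scoring each detector's predictions against the human annotations. A hedged sketch of that step, using an invented keyword-based toy detector and a handful of invented samples (the real benchmark uses actual detectors and 7,838 annotated examples):

```python
def keyword_detector(text: str) -> int:
    """Toy stand-in for a real hate speech detector: 1 = hate, 0 = benign."""
    return int(any(w in text.lower() for w in ("hate", "inferior")))

# (text, human_label) pairs; labels and texts are invented for illustration.
samples = [
    ("group X is inferior", 1),
    ("we welcome everyone", 0),
    ("I hate Mondays", 0),          # false positive for the toy detector
    ("they deserve respect", 0),
]

tp = fp = fn = tn = 0
for text, label in samples:
    pred = keyword_detector(text)
    tp += 1 if pred and label else 0
    fp += 1 if pred and not label else 0
    fn += 1 if not pred and label else 0
    tn += 1 if not pred and not label else 0

precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same loop with each detector over the full annotated dataset, split by generating LLM, is what reveals the degradation on newer model versions that the authors report.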

