supply chain · CI/CD · Python · GitHub Actions · remediation

A GitHub Comment Backdoored a Python Package. Read That Again.

An attacker posted a comment on a pull request. Within hours, elementary-data 0.23.3 was on PyPI, and for roughly twelve hours every data engineer who installed it was handing their warehouse credentials to a stranger. Your CI/CD pipeline is a factory floor with no locks on the doors.

Darius J Davis · April 23, 2026

#A comment. That's all it took.

Not a stolen credential. Not a phished maintainer. Not a compromised build cache. A GitHub comment on a pull request.

On April 24, 2026, a two-day-old GitHub account posted a comment on a pull request in the elementary-data repository. The comment looked like normal PR feedback. It was not normal PR feedback. It was a carefully crafted payload designed to exploit how the project's CI/CD pipeline handled GitHub event data.

Within hours, elementary-data version 0.23.3 was published to PyPI carrying a credential stealer. It stayed live for roughly twelve hours before it was caught and yanked. Version 0.23.4 is clean.

If you're a data engineer and you ran 0.23.3 during that window, assume your warehouse credentials, cloud IAM tokens, and .env secrets were harvested.

#How a comment becomes code execution.

This is where it gets uncomfortable. The attack exploited a pattern that exists in thousands of GitHub Actions workflows right now.

GitHub Actions workflows can reference event data using expressions like github.event.comment.body. That's the raw text of a comment someone posted on a PR. Many workflows interpolate this data directly into shell run blocks. Something like:

```yaml
run: |
  echo "Processing comment: ${{ github.event.comment.body }}"
```

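That looks harmless. It is not harmless. GitHub substitutes the expression into the script text before the shell ever sees it, so the comment body becomes part of the script itself. It's the equivalent of taking user input from a web form and dropping it directly into a SQL query. If the comment body contains shell commands, those commands execute in your CI environment with whatever permissions the workflow has.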
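To make the mechanism concrete, here is a minimal local sketch. It does not run GitHub Actions; it assumes a POSIX `sh` is available and imitates the substitution step with plain string formatting. The names `TEMPLATE`, `PAYLOAD`, `run_interpolated`, and `run_with_env` are illustrative, and the payload is a harmless stand-in, not the attacker's actual comment.

```python
import os
import subprocess

# Stand-in for the workflow's run: line. {body} plays the role of
# ${{ github.event.comment.body }}.
TEMPLATE = 'echo "Processing comment: {body}"'

# Harmless stand-in payload: it closes the quote, then adds its own command.
PAYLOAD = 'nice work"; echo INJECTED; echo "'


def run_interpolated(comment: str) -> str:
    """Paste the comment into the script text, then hand the script to the shell."""
    script = TEMPLATE.format(body=comment)
    result = subprocess.run(["sh", "-c", script], capture_output=True, text=True)
    return result.stdout


def run_with_env(comment: str) -> str:
    """Keep the script fixed and pass the comment as an environment variable."""
    script = 'echo "Processing comment: $COMMENT_BODY"'
    env = {**os.environ, "COMMENT_BODY": comment}
    result = subprocess.run(["sh", "-c", script], capture_output=True, text=True, env=env)
    return result.stdout


if __name__ == "__main__":
    print(run_interpolated(PAYLOAD))  # the injected echo runs as its own command
    print(run_with_env(PAYLOAD))      # the payload is printed as plain text
```

In the first function the shell receives three commands, because the payload has become part of the script. In the second, the script text never changes, so the same payload is only ever data.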

The attacker's comment contained shell commands. The workflow interpolated them. The commands ran. They had access to the GITHUB_TOKEN, which had push permissions. The attacker used that token to push a malicious build artifact that became version 0.23.3.

The entire attack chain: post a comment, get code execution, push a backdoored release. No credentials stolen. No accounts compromised. Just a comment on a pull request.

#The payload was designed to survive.

The backdoor wasn't a simple script tucked into setup.py. The attacker dropped a .pth file into site-packages. If you're not a Python developer, here's what that means: Python reads every .pth file in site-packages at startup, and any line in one that starts with import is executed as code. Not when you import the package. Not when you run a specific script. Every single time Python runs.

Install the compromised version once, and the credential stealer runs every time you open a Python shell, run a script, start a notebook, or kick off a data pipeline. It persists beyond the initial install. Even if you uninstall the package later, the .pth file may still be sitting in your site-packages directory, silently executing.
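The .pth behavior is easy to see for yourself with a harmless experiment. The sketch below writes a one-line .pth file into a temporary directory and asks Python's standard `site` module to process that directory the same way it processes site-packages at startup. The file name `demo.pth` and the marker variable `PTH_DEMO_RAN` are made up for this demo; nothing here reflects the contents of the real payload.

```python
import os
import site
import sys
import tempfile
from pathlib import Path


def demonstrate_pth_execution() -> bool:
    """Return True if an 'import' line inside a .pth file was executed."""
    marker = "PTH_DEMO_RAN"  # hypothetical name, used only by this demo
    os.environ.pop(marker, None)

    with tempfile.TemporaryDirectory() as tmp:
        # A .pth line that begins with "import " is executed, not treated as a path.
        pth_file = Path(tmp) / "demo.pth"
        pth_file.write_text(
            f'import os; os.environ["{marker}"] = "1"\n', encoding="utf-8"
        )

        # Process the directory the way site-packages is processed at startup.
        site.addsitedir(tmp)

        # addsitedir also appends the directory to sys.path; undo that.
        if tmp in sys.path:
            sys.path.remove(tmp)

    return os.environ.get(marker) == "1"


if __name__ == "__main__":
    print("import line in .pth executed:", demonstrate_pth_execution())
```

Nothing imported the "package" here. Processing the directory was enough to run the line, which is why a .pth file works so well as a persistence mechanism.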

#Data engineers are high-value targets.

elementary-data is a data observability tool. The people who install it are data engineers. Data engineers typically have access to:

  • Data warehouse credentials (Snowflake, BigQuery, Redshift, Databricks)
  • Cloud IAM tokens with broad permissions (because data pipelines need to touch a lot of services)
  • .env files full of API keys, database connection strings, and service account credentials
  • dbt profiles containing production database access
  • Kubernetes secrets for orchestration environments

This isn't a random npm package pulled in as a transitive dependency. This is a tool installed by people who hold the keys to your organization's most sensitive data infrastructure. The attacker knew exactly who they were targeting and exactly what those targets would have access to.

#Your CI/CD pipeline is the factory floor.

Think of your CI/CD pipeline as a factory floor. Raw materials (code) come in, finished products (releases) go out. Everything that happens on that floor -- testing, building, signing, publishing -- is trusted by default. When a package comes off that assembly line, you trust it because you trust the process that built it.

This attack didn't break into the factory. It didn't steal anyone's badge. It slid a note under the door, and the factory's own machinery read the note and followed its instructions.

That's the fundamental problem with supply chain attacks in 2026. The build pipeline is the single point of trust, and most teams treat it like it's inherently trustworthy. It's not. It's a machine that processes inputs, and if you let untrusted inputs reach the machine, the machine will process them faithfully.

We've seen this pattern before. TeamPCP compromised a Jenkins security scanner by targeting CI/CD credentials. The node-ipc backdoor used a twelve-dollar domain purchase to get publish access. Each attack exploits a different weak point in the pipeline, but the result is always the same: a trusted package ships malware through a trusted channel.

#What to do.

If you ran elementary-data 0.23.3 or pulled container images from that release window (April 24-25, 2026):

  1. Upgrade immediately to 0.23.4 or later.
  2. Search your site-packages directories for unexpected .pth files. The malicious .pth file persists beyond package uninstall. Check every Python environment that ran the compromised version -- virtualenvs, containers, CI runners, production servers.
  3. Rotate every secret that was reachable from those environments. Warehouse credentials, cloud IAM tokens, API keys, database connection strings, dbt profiles, .env contents. All of it. Assume it was exfiltrated.
  4. Audit container images. If you build Docker images that include elementary-data, check whether any images from that window are still running or cached in your registry.
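The .pth search can be scripted. The following is a rough sketch, not a detector for this specific malware: the malicious file's name is not given here, so the script simply lists every .pth line that Python would execute at startup. Some legitimate packages ship such lines too, so treat each hit as something to review by hand. The function names are illustrative.

```python
import site
from pathlib import Path


def find_pth_import_lines(directories):
    """Return (file, line number, line) for each .pth line Python would execute."""
    findings = []
    for directory in directories:
        for pth_file in sorted(Path(directory).glob("*.pth")):
            text = pth_file.read_text(encoding="utf-8", errors="replace")
            for lineno, line in enumerate(text.splitlines(), start=1):
                # The site module executes lines starting with "import" plus a space or tab.
                if line.startswith(("import ", "import\t")):
                    findings.append((str(pth_file), lineno, line.strip()))
    return findings


def default_site_dirs():
    """Site-packages directories of the interpreter running this script."""
    dirs = list(site.getsitepackages())
    dirs.append(site.getusersitepackages())
    return [d for d in dirs if Path(d).is_dir()]


if __name__ == "__main__":
    for path, lineno, line in find_pth_import_lines(default_site_dirs()):
        print(f"{path}:{lineno}: {line[:120]}")
```

Run it with the interpreter of each environment you need to check (each virtualenv, container image, and CI runner), since it only inspects the site-packages of the Python that executes it.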

For everyone running GitHub Actions:

  1. Audit your workflows for script injection. Search for ${{ github.event. inside run: blocks. Every instance is a potential injection point. Expressions such as ${{ github.event.comment.body }}, ${{ github.event.issue.title }}, and ${{ github.event.pull_request.title }} are all attacker-controlled inputs.
  2. Use environment variable indirection instead of direct interpolation. Pass event data through environment variables. The value then reaches the shell as data at runtime instead of being pasted into the script text, so it is not parsed as code (keep the variable quoted, as here):

```yaml
env:
  COMMENT_BODY: ${{ github.event.comment.body }}
run: |
  echo "Processing comment: $COMMENT_BODY"
```

  3. Restrict GITHUB_TOKEN permissions. Use permissions: at the workflow or job level to grant only the access each job actually needs. A workflow that runs tests does not need contents: write.
  4. Require approval for workflows triggered by external contributors. GitHub offers "Require approval for all outside collaborators" in repository settings. Enable it.
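The workflow audit can be given a head start with a small script. This is a heuristic sketch, not a YAML parser: it tracks `run:` blocks by indentation and flags any `${{ github.event.… }}` expression inside them. It assumes workflows live under `.github/workflows`, it will flag some fields that are not attacker-controlled, and it will miss dangerous values outside `github.event`. The function names are illustrative.

```python
import re
from pathlib import Path

EVENT_EXPR = re.compile(r"\$\{\{\s*github\.event\.[^}]*\}\}")
RUN_KEY = re.compile(r"^(\s*(?:-\s+)?)run:\s*(.*)$")


def find_injection_points(workflow_text):
    """Return (line number, line) for github.event expressions inside run: blocks."""
    findings = []
    run_col = None  # column of the active `run:` key, or None outside a run block
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = RUN_KEY.match(line)
        if match:
            run_col = len(match.group(1))
        elif run_col is not None and line.strip():
            indent = len(line) - len(line.lstrip(" "))
            if indent <= run_col:
                run_col = None  # dedent: the run block has ended
        if run_col is not None and EVENT_EXPR.search(line):
            findings.append((lineno, line.strip()))
    return findings


def scan_workflows(repo_root="."):
    """Scan every workflow file under repo_root and return hits per file."""
    results = {}
    workflow_dir = Path(repo_root, ".github", "workflows")
    for path in sorted(workflow_dir.glob("*.y*ml")):
        hits = find_injection_points(path.read_text(encoding="utf-8"))
        if hits:
            results[str(path)] = hits
    return results


UNSAFE = """\
jobs:
  triage:
    steps:
      - name: Echo comment
        run: |
          echo "Processing comment: ${{ github.event.comment.body }}"
"""

SAFE = """\
jobs:
  triage:
    steps:
      - name: Echo comment
        env:
          COMMENT_BODY: ${{ github.event.comment.body }}
        run: |
          echo "Processing comment: $COMMENT_BODY"
"""

if __name__ == "__main__":
    print(find_injection_points(UNSAFE))  # flags the echo line
    print(find_injection_points(SAFE))    # the env: mapping is not flagged
```

The same expression appears in both samples. Only the one inside the `run:` block is reported, because that is the only place where the value becomes script text.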

For businesses that depend on open-source data tools:

  1. Pin your dependency versions and review updates before accepting them. Pinning to a known-good version would have kept you off 0.23.3 for the entire twelve hours it was live -- enough time for the community to catch it.
  2. Use supply chain monitoring tools like Snyk or Socket.dev to get alerts on suspicious package publishes. Both flagged this compromise.
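As a starting point for the pinning advice, a requirements file can be checked mechanically. The sketch below assumes a simple pip-style requirements file. It reports every entry that is not pinned with `==` and every entry pinned to the compromised elementary-data 0.23.3. It is deliberately simplified (it ignores hashes and URL requirements), and the function name is illustrative.

```python
import re

# name, optional [extras], then an exact == pin with no wildcard
PINNED = re.compile(
    r"^(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?"
    r"\s*==\s*(?P<version>[A-Za-z0-9.!+_-]+)$"
)

# The compromised release named in this post.
KNOWN_BAD = {("elementary-data", "0.23.3")}


def audit_requirements(text):
    """Return (unpinned, compromised) requirement strings from requirements text."""
    unpinned, compromised = [], []
    for raw in text.splitlines():
        # Drop comments and environment markers, then surrounding whitespace.
        req = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not req or req.startswith("-"):
            continue  # blank line, comment, or pip option such as -r / --index-url
        match = PINNED.match(req)
        if not match:
            unpinned.append(req)
            continue
        name = re.sub(r"[-_.]+", "-", match.group("name")).lower()
        if (name, match.group("version")) in KNOWN_BAD:
            compromised.append(req)
    return unpinned, compromised


if __name__ == "__main__":
    sample = "elementary-data==0.23.3\ndbt-core>=1.7\nrequests==2.31.0  # ok\n"
    print(audit_requirements(sample))
```

A check like this fits in a pre-merge CI job, but it is only a first filter. A pin protects you only if someone reviews the diff whenever the pin changes.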
