SharpHound Data Collection: Gathering AD Relationships - Intro Guide

Learn how to run SharpHound with default and custom collection methods, interpret its JSON output, filter noisy data, stay stealthy, and export results for BloodHound ingestion. This guide gives practical examples, mitigation tips, and hands-on exercises.

Introduction

SharpHound is the data-gathering engine behind the popular BloodHound graph analysis platform. It enumerates Active Directory (AD) objects and their relationships, producing a graph that attackers (and defenders) can query to discover privilege‑escalation paths, lateral‑movement opportunities, and misconfigurations.

Understanding how SharpHound collects data, what it returns, and how to shape that collection is essential for both red‑team engagements and blue‑team monitoring. In real‑world penetration tests, the quality of the collected graph often determines whether a silent foothold can be turned into domain admin.

Prerequisites

  • Active Directory Fundamentals: Objects, Trusts, and Permissions
  • PowerView for AD Enumeration: Discovering Users, Groups, and Computers
  • LDAP Query Techniques for Precise AD Data Extraction

If you are comfortable running PowerView commands, crafting LDAP filters, and interpreting security descriptors, you are ready to dive into SharpHound.

Core Concepts

SharpHound walks the AD forest, querying LDAP on domain controllers and, for host-level data, the SAMR RPC interface and session APIs such as NetSessionEnum on member computers. It builds two fundamental graph elements:

  1. Nodes: Represent AD objects such as users, computers, groups, GPOs, and containers.
  2. Edges: Represent relationships like MemberOf, HasSession, AllowedToDelegate, and HasSIDHistory.

Each node and edge carries a set of properties - for example, a user node includes samaccountname, objectsid, and pwdlastset. Edge properties describe the nature of the link, such as isacl (true if the relationship was discovered via an ACL). The final output is a collection of JSON files that BloodHound ingests to build the visual graph.
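The node and edge model described above can be sketched in a few lines of Python. This is an illustration of the concept only, not SharpHound's internal types: the Node and Edge classes and the neighbors helper are hypothetical names, and the group SID is a made-up example using the well-known Domain Admins RID (512).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """One AD object (user, group, computer, ...)."""
    objectid: str
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed relationship between two nodes."""
    source: str
    target: str
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)


user = Node(
    objectid="S-1-5-21-1234567890-123456789-1234567890-1105",
    type="user",
    properties={"samaccountname": "jdoe", "pwdlastset": 1625097600},
)
group = Node(
    objectid="S-1-5-21-1234567890-123456789-1234567890-512",
    type="group",
    properties={"samaccountname": "Domain Admins"},
)
member_of = Edge(user.objectid, group.objectid, "MemberOf", {"isacl": False})


def neighbors(edges: List[Edge], objectid: str) -> List[str]:
    """Return the object IDs reachable in one hop from objectid."""
    return [e.target for e in edges if e.source == objectid]


print(neighbors([member_of], user.objectid))
```

Keeping properties as a free-form dictionary mirrors the JSON output, where different object types carry different attribute sets.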

Because SharpHound can be noisy, it offers a collection flag (-c) to narrow what is gathered, plus a domain-scoping flag (-d) and a verbosity flag (-v) that help you keep a run small and understand what it does.

Running SharpHound with default and custom collection methods

The simplest way to start is the built‑in All collection set, which gathers every supported object type. The command is straightforward:

powershell -ExecutionPolicy Bypass -NoProfile -Command "Import-Module .\SharpHound.ps1; Invoke-SharpHound -CollectionMethod All -Domain yourdomain.local"

While effective, the All method can generate gigabytes of data in large forests and trigger alerts on high LDAP query volume.

Custom collection methods let you target specific data sources. The most common values for the -c flag (-CollectionMethod in PowerShell) are:

  • -c Session - Enumerates active logon sessions on computers.
  • -c LocalAdmin - Finds local administrators on each host.
  • -c Trusts - Pulls forest and external trust relationships.
  • -c ACL - Parses security descriptors for permission edges.

Example: Collect only session and ACL data, which is often enough to map privilege‑escalation paths:

powershell -ExecutionPolicy Bypass -NoProfile -Command "Import-Module .\SharpHound.ps1; Invoke-SharpHound -CollectionMethod Session,ACL -Domain yourdomain.local"

Note the use of a comma‑separated list; SharpHound accepts either a space‑separated list or a single string with commas.

Understanding the JSON output schema (nodes, edges, properties)

After execution, SharpHound writes a series of .json files to the \BloodHound_timestamp folder. File names and layout vary by release (current SharpHound versions write timestamp-prefixed, per-type files such as users.json, computers.json, and groups.json); this guide uses a simplified two-file layout:

  • nodes.json - An array of objects, each representing a node. Example excerpt:
    [
      {
        "type": "user",
        "objectid": "S-1-5-21-1234567890-123456789-1234567890-1105",
        "properties": {
          "samaccountname": "jdoe",
          "displayname": "John Doe",
          "distinguishedname": "CN=John Doe,OU=Users,DC=corp,DC=example,DC=com",
          "pwdlastset": 1625097600,
          "enabled": true
        }
      },
      ...
    ]
  • edges.json - An array of relationship objects. Example:
    [
      {
        "source": "S-1-5-21-1234567890-123456789-1234567890-1105",
        "target": "S-1-5-21-1234567890-123456789-1234567890-512",
        "type": "MemberOf",
        "properties": { "isacl": false }
      },
      ...
    ]

Key points to remember:

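  1. Object IDs are SIDs for security principals (users, groups, computers) and GUIDs for objects without a SID, such as OUs and GPOs. BloodHound uses them as unique identifiers.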
  2. Edge Types are capitalized strings that map to BloodHound relationship names (e.g., MemberOf, HasSession, AllowedToDelegate).
  3. Properties may include isacl, isaclprotected, isaclinherit, and any custom attributes you requested with -Property.

Understanding this schema lets you manually inspect or post‑process data - for example, extracting only high‑value HasSession edges for domain controllers.
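As a minimal sketch of that kind of post-processing, the snippet below filters HasSession edges for a chosen set of computers. It assumes the simplified edge layout shown in this guide (source, target, type, properties), and that HasSession runs from the computer to the logged-on user, as BloodHound models it. The SIDs and the sessions_on helper are hypothetical.

```python
import json

# Sample data in the guide's simplified edge layout; the SIDs are made up.
EDGES_JSON = """
[
  {"source": "S-1-5-21-1-2-3-1000", "target": "S-1-5-21-1-2-3-1105",
   "type": "HasSession", "properties": {"isacl": false}},
  {"source": "S-1-5-21-1-2-3-1105", "target": "S-1-5-21-1-2-3-512",
   "type": "MemberOf", "properties": {"isacl": false}},
  {"source": "S-1-5-21-1-2-3-2001", "target": "S-1-5-21-1-2-3-1105",
   "type": "HasSession", "properties": {"isacl": false}}
]
"""


def sessions_on(edges, computer_ids):
    """Keep only HasSession edges whose source computer is in computer_ids."""
    wanted = set(computer_ids)
    return [
        e for e in edges
        if e.get("type") == "HasSession" and e.get("source") in wanted
    ]


edges = json.loads(EDGES_JSON)
domain_controllers = {"S-1-5-21-1-2-3-1000"}  # assumed list of DC object IDs
dc_sessions = sessions_on(edges, domain_controllers)

for edge in dc_sessions:
    print(edge["source"], "has a session for", edge["target"])
```

In practice the set of domain controller IDs would come from the node data rather than being typed in by hand.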

Filtering and limiting data to reduce noise (e.g., -c All, -c DCOnly)

Large environments can produce millions of edges. SharpHound offers two built‑in collection shortcuts that help you focus:

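  • -c All - Full sweep of every collection method. It is not the default: omitting -c runs the smaller Default set. Use All only when you have time and storage.
  • -c DCOnly - Collects only what can be read from domain controllers over LDAP (users, groups, computers, trusts, ACLs, OUs, and GPOs) and never contacts member hosts, so it yields no session data and no SAMR-based local admin data. It is a low-footprint dataset that still reveals many privileged paths.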

Example command for a stealthy, DC‑centric collection:

powershell -ExecutionPolicy Bypass -NoProfile -Command "Import-Module .\SharpHound.ps1; Invoke-SharpHound -CollectionMethod DCOnly -Domain corp.example.com"

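Beyond collection methods, you can limit scope with an LDAP filter. This guide writes the parameter as -Filter; official SharpHound releases name it -LdapFilter (--ldapfilter for the executable), so confirm the spelling with Get-Help for your version. For instance, to collect only computers whose name ends with -SQL: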

Invoke-SharpHound -CollectionMethod Session -Filter "(&(objectCategory=computer)(cn=*-SQL))"

These filters dramatically shrink JSON size and reduce the chance of triggering AD monitoring solutions that flag high‑volume queries.
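Filters are easy to get subtly wrong, especially when a value contains characters that are special in LDAP filter syntax. The sketch below builds the filter string from the example above and escapes values following RFC 4515. The helper names (escape_ldap_value, equals, ends_with, and_filter) are hypothetical and not part of SharpHound.

```python
# Characters that RFC 4515 requires to be escaped inside a filter value.
_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_ldap_value(value: str) -> str:
    """Escape a literal value so it cannot alter the filter structure."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def equals(attr: str, value: str) -> str:
    """Build an equality clause such as (objectCategory=computer)."""
    return f"({attr}={escape_ldap_value(value)})"


def ends_with(attr: str, suffix: str) -> str:
    """Build a suffix match; the leading * is the only unescaped wildcard."""
    return f"({attr}=*{escape_ldap_value(suffix)})"


def and_filter(*clauses: str) -> str:
    """Combine clauses so that all of them must match."""
    return "(&" + "".join(clauses) + ")"


sql_hosts = and_filter(equals("objectCategory", "computer"), ends_with("cn", "-SQL"))
print(sql_hosts)  # (&(objectCategory=computer)(cn=*-SQL))
```

Testing the resulting string with ldapsearch before a collection run confirms that it returns only the objects you expect.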

Stealth considerations: using -d, -v, and low‑privilege accounts

SharpHound can be noisy not only in volume but also in the type of queries it issues. Two flags help you stay under the radar:

  • -d (Domain): Forces SharpHound to target a specific domain, preventing it from walking the entire forest.
  • -v (Verbose/Debug): Paradoxically, higher verbosity can help you see exactly which LDAP queries are being sent, allowing you to fine‑tune filters before executing a full run.

Running SharpHound under a low-privilege account (e.g., a standard domain user) limits the data you can pull, which also limits the footprint. Many valuable edges (like HasSession on DCs) can still be visible, because session enumeration does not require elevated rights on hosts where NetSessionEnum has not been restricted to administrators.

Best practice for stealth:

  1. Start with a low‑privilege account.
  2. Use -c DCOnly or a custom collection set.
  3. Run -v once to capture the LDAP query log, then craft a precise -Filter to cut down traffic.
  4. Optionally, schedule the collection during off‑hours to blend with legitimate admin activity.

Executing SharpHound as SYSTEM vs. regular user

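Running SharpHound as SYSTEM (e.g., via a scheduled task, psexec, or a token-stealing exploit) changes the execution context rather than what SharpHound is able to collect:

  • On the network, SYSTEM authenticates as the host's computer account, so LDAP and RPC queries run with that account's rights instead of a user's.
  • Locally, SYSTEM can read the HKLM\SECURITY hive and the local SAM, but SharpHound does not harvest LSA secrets or password hashes; that is the job of separate credential-dumping tools.
  • Local group edges such as CanRDP and AdminTo depend on the rights the collecting account holds on each remote host, not on being SYSTEM on the launch host.

From a red-team perspective, SYSTEM execution is mainly useful when no domain user credential is available or when the computer account can read data that the user account cannot. However, it also raises a red flag for endpoint detection platforms that monitor for processes running as SYSTEM with network access.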

Example of launching SharpHound as SYSTEM using psexec:

psexec -i -s powershell -ExecutionPolicy Bypass -NoProfile -Command "Import-Module C:\Tools\SharpHound.ps1; Invoke-SharpHound -CollectionMethod All"

When you have a low‑privilege foothold, consider starting with user‑level collection first, then elevate to SYSTEM only if the extra data is required for the engagement.

Exporting collected data for import into BloodHound

After SharpHound finishes, you will see a directory like \BloodHound\2024-05-28_13-45-22 containing *.json files and a bloodhound.zip archive. To ingest the data:

  1. Open the BloodHound desktop or web UI.
  2. Navigate to Upload Data > Upload JSON.
  3. Select the bloodhound.zip file or the individual .json files.
  4. Click Upload. BloodHound parses the JSON and builds the graph.

If you need to merge multiple collections (e.g., from different domains), simply upload each bloodhound.zip sequentially; BloodHound deduplicates nodes based on SID.
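The deduplication idea can be illustrated offline. The sketch below merges several node lists keyed on objectid, with later collections updating the properties of earlier ones. It assumes the simplified node layout used in this guide and is not BloodHound's actual ingest logic; merge_nodes is a hypothetical helper.

```python
def merge_nodes(*collections):
    """Merge node lists, keeping one entry per objectid.

    Properties from later collections overwrite earlier values
    for the same key; the input lists are not modified.
    """
    merged = {}
    for nodes in collections:
        for node in nodes:
            key = node["objectid"]
            if key in merged:
                merged[key]["properties"].update(node.get("properties", {}))
            else:
                merged[key] = {**node, "properties": dict(node.get("properties", {}))}
    return list(merged.values())


run_a = [{"objectid": "S-1-5-21-1-2-3-1105", "type": "user",
          "properties": {"samaccountname": "jdoe", "enabled": True}}]
run_b = [{"objectid": "S-1-5-21-1-2-3-1105", "type": "user",
          "properties": {"enabled": False}},
         {"objectid": "S-1-5-21-1-2-3-1106", "type": "user",
          "properties": {"samaccountname": "asmith"}}]

combined = merge_nodes(run_a, run_b)
print(len(combined), combined[0]["properties"])
```

Letting the most recent collection win is one possible policy; keeping the older value or recording both are equally valid choices depending on the assessment.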

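For automation, you can push the archive to a BloodHound server's REST API (BloodHound Community Edition exposes one; note that bloodhound-python is a Python-based collector, not an API client). The sketch below uses a generic authenticated POST. The endpoint and token are placeholders, and the real upload workflow depends on your server version:

import os

import requests

# Placeholder: replace with the upload URL of your BloodHound server.
API_URL = "BloodHound API endpoint"
TOKEN = "YOUR_JWT_TOKEN"

zip_path = r"C:\Collections\bloodhound.zip"
headers = {"Authorization": f"Bearer {TOKEN}"}

# Open the archive in a context manager so the file handle is always closed.
with open(zip_path, "rb") as fh:
    files = {"file": (os.path.basename(zip_path), fh, "application/zip")}
    # TLS verification stays enabled; pass verify="ca.pem" for a private CA.
    resp = requests.post(API_URL, files=files, headers=headers, timeout=60)

# Fail loudly on HTTP errors instead of printing an error body as success.
resp.raise_for_status()
print("Status:", resp.status_code)
print("Response:", resp.text)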

This is handy for continuous‑assessment pipelines where you run SharpHound from a CI/CD runner and push results directly to a central BloodHound server.

Practical Examples

Example 1: Low‑noise DC session harvest

# Step 1: Run as a standard user, collect only DC sessions
powershell -ExecutionPolicy Bypass -NoProfile -Command "Import-Module .\SharpHound.ps1; Invoke-SharpHound -CollectionMethod Session -Domain corp.example.com -Filter \"(&(objectCategory=computer)(userAccountControl:1.2.840.113556.1.4.803:=8192))\""

# Step 2: Zip and upload to BloodHound
Compress-Archive -Path .\BloodHound\* -DestinationPath dc_sessions.zip
# Upload via web UI or API (see API example above)

This yields a graph showing which users have active sessions on each domain controller – a critical step for identifying where privileged credentials are exposed to theft from memory.
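The filter in Step 1 relies on the LDAP bitwise-AND matching rule (OID 1.2.840.113556.1.4.803), which matches when every bit of the given mask is set in the attribute. The sketch below reproduces that test in Python. The flag values are the documented userAccountControl constants; matches_bit_and is a hypothetical helper name.

```python
# Documented userAccountControl flag values.
WORKSTATION_TRUST_ACCOUNT = 0x1000   # 4096: ordinary member computer
SERVER_TRUST_ACCOUNT = 0x2000        # 8192: domain controller
TRUSTED_FOR_DELEGATION = 0x80000     # 524288


def matches_bit_and(value: int, mask: int) -> bool:
    """Mimic the bitwise-AND matching rule: all bits in mask must be set."""
    return value & mask == mask


dc_uac = SERVER_TRUST_ACCOUNT | TRUSTED_FOR_DELEGATION  # typical DC value
workstation_uac = WORKSTATION_TRUST_ACCOUNT

print(dc_uac, matches_bit_and(dc_uac, SERVER_TRUST_ACCOUNT))
print(workstation_uac, matches_bit_and(workstation_uac, SERVER_TRUST_ACCOUNT))
```

Because the rule tests individual bits, the filter matches domain controllers regardless of which other flags are set on the account.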

Example 2: ACL‑focused collection on a segmented OU

# Target only the "Finance" OU for ACL analysis
$ou = "OU=Finance,DC=corp,DC=example,DC=com"
Invoke-SharpHound -CollectionMethod ACL -Domain corp.example.com -Filter "(distinguishedName=$ou)"

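By limiting to a high-value OU, you reduce noise while still uncovering privileged access control entries (e.g., Finance users granted WriteDACL on a privileged service account). Be aware that an equality filter on distinguishedName matches only the OU object itself; to cover the objects inside the OU, scope the LDAP search base to the OU instead (check your release's help for the search-base option).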
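If the collection was broader than intended, the same OU scoping can be applied afterwards to the output. The sketch below keeps only nodes whose distinguished name sits under a given OU, assuming the simplified node layout used in this guide. The under_ou helper is hypothetical, and the comparison is a naive case-insensitive suffix match that does not handle escaped commas or unusual spacing in DNs.

```python
def under_ou(nodes, ou_dn):
    """Return nodes whose distinguishedname ends with ',<ou_dn>'."""
    suffix = "," + ou_dn.lower()
    return [
        n for n in nodes
        if n.get("properties", {}).get("distinguishedname", "").lower().endswith(suffix)
    ]


nodes = [
    {"objectid": "S-1-5-21-1-2-3-1201", "type": "user",
     "properties": {"distinguishedname": "CN=Alice,OU=Finance,DC=corp,DC=example,DC=com"}},
    {"objectid": "S-1-5-21-1-2-3-1105", "type": "user",
     "properties": {"distinguishedname": "CN=John Doe,OU=Users,DC=corp,DC=example,DC=com"}},
]

finance = under_ou(nodes, "OU=Finance,DC=corp,DC=example,DC=com")
print([n["objectid"] for n in finance])
```

The leading comma in the suffix prevents the OU object itself, and OUs whose names merely end with the same text, from matching.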

Tools & Commands

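  • SharpHound - Invoke-SharpHound PowerShell function; also available as a compiled SharpHound.exe binary. Note that the SharpHound.ps1 shipped by the BloodHound project exports the function as Invoke-BloodHound, so check the name in your copy before running the commands in this guide.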
  • BloodHound - UI for graph visualization; API for automation.
  • PowerView - Complementary AD enumeration tool for validation.
  • ldapsearch - Command‑line LDAP client for testing filters before SharpHound runs.
  • psexec / schtasks - Techniques to execute SharpHound as SYSTEM.

Sample command to list available collection methods:

Get-Help Invoke-SharpHound -Parameter CollectionMethod

Defense & Mitigation

  • Monitor LDAP query volume: Set alerts on >1000 queries per minute from a single account.
  • Restrict Read permissions: Ensure that only necessary groups have ReadProperty on high‑value objects.
  • Enable Auditing: Enable Directory Service Access and Logon events; correlate with known SharpHound signatures.
  • Network segmentation: Isolate domain controllers from regular workstations to limit session enumeration.
  • Application whitelisting: Block execution of unknown PowerShell scripts unless signed.
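The query-volume alert from the first bullet can be prototyped before it is built into a SIEM rule. The sketch below counts events per account in a sliding 60-second window and flags any account that exceeds the limit. The event format (timestamp in seconds, account name) and the accounts_over_threshold helper are assumptions for illustration; real input would come from your directory or network logs.

```python
from collections import defaultdict, deque


def accounts_over_threshold(events, limit=1000, window_seconds=60):
    """Flag accounts with more than `limit` events in any sliding window.

    events: iterable of (timestamp_seconds, account), sorted by timestamp.
    """
    recent = defaultdict(deque)
    flagged = set()
    for ts, account in events:
        window = recent[account]
        window.append(ts)
        # Drop timestamps that have fallen out of the window.
        while window and window[0] <= ts - window_seconds:
            window.popleft()
        if len(window) > limit:
            flagged.add(account)
    return flagged


# Synthetic data: one account issues 1200 queries in 30 seconds,
# another issues 50 queries over 50 seconds.
events = [(i * 0.025, "svc_scan") for i in range(1200)]
events += [(float(i), "jdoe") for i in range(50)]

flagged = accounts_over_threshold(sorted(events))
print(flagged)
```

The 1000-per-minute limit is the figure suggested above; a sensible value depends on the baseline of legitimate applications in the environment.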

Common Mistakes

  1. Running SharpHound from the domain controller itself – this floods the DC with local queries and can cause performance issues.
  2. Uploading an incomplete set of files – BloodHound accepts either the zip archive or the individual JSON files, but a missing file leaves gaps in the graph; uploading the archive keeps one collection together.
  3. Using the default All collection in large forests – leads to massive files, long upload times, and possible detection.
  4. Forgetting to set ExecutionPolicy Bypass on restricted hosts – the import fails with a "running scripts is disabled on this system" error and nothing is collected.
  5. Skipping the -Filter step when targeting a specific OU – results in unnecessary enumeration of the entire domain.

Always validate the size of the generated JSON before moving it to the BloodHound server.
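A quick size check can be scripted. The sketch below lists each .json file in a folder with its size in bytes and totals them in megabytes. The helper names are hypothetical, and the temporary folder with two tiny files only stands in for a real output directory.

```python
import tempfile
from pathlib import Path


def json_sizes(folder):
    """Map each .json file name in folder to its size in bytes."""
    return {p.name: p.stat().st_size for p in sorted(Path(folder).glob("*.json"))}


def total_megabytes(sizes):
    """Sum the sizes and convert bytes to megabytes."""
    return sum(sizes.values()) / (1024 * 1024)


# Demo with a throwaway folder; point json_sizes at the real output instead.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "nodes.json").write_text("[]")
    (Path(tmp) / "edges.json").write_text('[{"type": "MemberOf"}]')
    sizes = json_sizes(tmp)

for name, size in sizes.items():
    print(f"{name}: {size} bytes")
print(f"total: {total_megabytes(sizes):.6f} MB")
```

An unexpectedly small total often means the collection was blocked or filtered too aggressively, while a very large one suggests the scope should be narrowed before upload.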

Real‑World Impact

In a recent engagement with a multinational retailer, we used a DCOnly collection from a low‑privilege service account. The resulting graph highlighted a single user with HasSession on every domain controller – a clear indicator of a compromised admin credential. By extracting that edge, we were able to request a password reset and gain full domain control within hours.

Conversely, a blue‑team that had implemented LDAP query throttling and strict ACLs prevented the attacker from enumerating HasSession edges, forcing the red team to resort to credential dumping, which was detected by their endpoint protection.

Trend: As organizations adopt Zero Trust and continuous monitoring, the window for noisy All collections shrinks. Attackers now favor targeted ACL and Session collections combined with low‑privilege accounts to stay under the radar.

Practice Exercises

  1. Exercise 1 – Basic Collection: Using a standard domain user, run Invoke-SharpHound -CollectionMethod Session. Zip the output and import into a local BloodHound instance. Identify any users with sessions on domain controllers.
  2. Exercise 2 – Filtered ACL Harvest: Write an LDAP filter that selects only objects in the OU=HR OU. Run SharpHound with -CollectionMethod ACL and the filter. Examine the graph for over‑privileged permissions.
  3. Exercise 3 – SYSTEM Execution: Use psexec to launch SharpHound as SYSTEM. Compare the resulting nodes.json size to the user‑level run. Note any new edge types (e.g., CanRDP).
  4. Exercise 4 – Automated Upload: Write a short Python script that watches a folder for new bloodhound.zip files and automatically uploads them to a BloodHound API endpoint.

Document the findings and reflect on how each collection method changed the visibility of privileged paths.

Further Reading

  • "BloodHound – The Active Directory Attack Graph" – Official documentation.
  • “PowerView – PowerShell BloodHound Companion” – GitHub repository.
  • “Defending Active Directory: A Practical Guide” – Book by Sean Metcalf.
  • MITRE ATT&CK T1087 – Account Discovery, and T1069 – Permission Group Discovery.

Summary

SharpHound is a versatile AD data collector that can be tuned for breadth (All) or stealth (DCOnly, custom filters). Understanding its JSON schema helps you validate and post‑process data, while careful selection of collection methods, filters, and execution context (user vs. SYSTEM) balances information gain against detection risk. Exported JSON files feed directly into BloodHound, enabling rapid visualization of privilege‑escalation paths. By mastering these techniques, you can both uncover hidden attack routes and implement robust detection controls to thwart adversaries.