Announcing YetiHunter: An open-source tool to detect and hunt for suspicious activity in Snowflake

Illustration Cloud

Intern Showcase: Anonymizing Logs Made Easy with LogLicker

logo-github LogLicker On GitHub: 


Logs play a crucial role in monitoring and analyzing system activity, but handling sensitive information within them can be a daunting task. Whether you're sharing logs with third-party services like ChatGPT or revealing examples of malicious activities on online forums, the process of finding and replacing sensitive data is time-consuming and error prone.

Introducing LogLicker

LogLicker is a tool designed to simplify the process of anonymizing logs by replacing sensitive information with randomized placeholders. While LogLicker was designed to address AWS CloudTrail logs, it can also be configured to work with other log types. This enables you to share logs more freely and perform analyses without compromising data privacy. LogLicker is written in python and requires the packages: boto3, exrex, and regex.

Key Features:

  • Information Detection: LogLicker uses regular expressions (regexes) to identify sensitive data within the logs. This can include IP addresses, email addresses, access keys, and more.
  • Information Replacement: LogLicker uses regular expressions to generate random values that correspond closely to the sensitive data, replacing it with these similar values.
  • Manifest File: LogLicker creates a manifest file that maps the original sensitive information to the corresponding replacement values. This manifest file can be stored and used later to deanonymize the data.
  • Customization: By editing the default regex files or adding new ones, this utility can also be used to identify sensitive information from other services.

How LogLicker Works:

  1. Data Source Selection: You can use LogLicker to process logs from either a local text file or directly fetch them from the CloudTrail API.
  2. Sensitive Information Detection: LogLicker uses regexes to scan the logs and identifies any instances of sensitive information that need to be anonymized.
  3. Safe Replacement: When LogLicker detects sensitive data, it generates values based on the regexes, preserving the data’s format and context without compromising privacy.
  4. Manifest Creation: LogLicker generates a manifest file of changes made. The manifest can be used to identify the information found within the logs and input it back into the tool for easy deanonymization.

Use Cases:

  • Secure Sharing: Anonymize logs before sharing them with third-party services or online communities, ensuring sensitive data remains protected.
  • Analysis: Anonymized logs enable analysis of logs without risking exposure of sensitive information.
  • Identification of Sensitive Information: By using LogLicker and the generated manifest file, you can more easily view all discovered sensitive information.

Use Examples

Help -h

The three subparsers each have their own help details.

python3 rawtext -h
python3 cloudtrail -h
python3 rawcloudtrail -h


Example 1 - Typical Use Case

This is an example of using LogLicker to anonymize logs, processing the logs through a third-party service, then deanonymizing the logs.

1. Process logs from input/rawcloudtrail.txt and output anonymized logs to output/anonymizedrawtext.txt. Save the manifest file to output/sensitiveinfomanifest.json. (If no output is provided, a default name is used in the tool’s ‘output’ directory. All output will append a unique hash resulting from the reviewed data.)

python3 rawtext -ifp output/rawcloudtrail.txt -ofp output/anonymizedrawtext.txt -omfp output/sensitiveinfomanifest.json

Snippet from output/anonymizedrawtext.txt:

    “eventVersion”: “1.09",
    “userIdentity”: {
        “type”: “IAMUser”,
        “principalId”: “AIDAZIA2LHZ7HRF5DKD6”,
        “arn”: “arn:aws:iam::668284825298:user/random-generated-nameRXnmhQqIgPwq”,
        “accountId”: “668284825298”,
        “accessKeyId”: “AKIALZ6DQM4QEAVPM4OK”,
        “userName”: “random-generated-nameRXnmhQqIgPwq”
    “eventTime”: “2023-08-03T14:51:07Z”,
    “eventSource”: “”,
    “eventName”: “LookupEvents”,
    “awsRegion”: “eu-southnorth-1”,
    “sourceIPAddress”: “”,
    “userAgent”: “Boto3/1.27.0 md/Botocore#1.30.0 ua/2.0 os/windows#10 md/arch#amd64 lang/python#3.11.4 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.30.0”,
    “requestParameters”: {
        “startTime”: “Jan 1, 2023, 12:00:00 AM”,
        “endTime”: “Feb 2, 2023, 12:00:00 AM”,
        “maxResults”: 1000
    “responseElements”: null,
    “requestID”: “474571a3-a575-4468-9674-1ae1fc4752e9",
    “eventID”: “a709d759-acbb-49db-8a9f-faf013b923b0",
    “readOnly”: true,
    “eventType”: “AwsApiCall”,
    “managementEvent”: true,
    “recipientAccountId”: “668284825298",
    “eventCategory”: “Management”,
    “tlsDetails”: {
        “tlsVersion”: “TLSv1.2”,
        “cipherSuite”: “ECDHE-RSA-AES128-GCM-SHA256”,
        “clientProvidedHostHeader”: “

Snippet from output/sensitiveinfomanifest.json:

    "longTermAccessKeyID": {
    "shortTermAccessKeyID": {
    "publicKeyID": {},
    "stsServiceBearerTokenID": {},
    "contextSpecificCredentialID": {},
    "groupID": {},
    "ec2InstanceProfileID": {},
    "iamUserID": {},
    "managedPolicyID": {},
    "roleID": {
    "certificateID": {},
    "accountID": {
        "497402563602": "668284825298"
    "username": {
        "random-nameqPHgAoU59wq5": "random-generated-nameRXnmhQqIgPwq",
        "random-namemFoXAeql5rNA": "random-generatedZIsNOnywsrZQ",
        "random-generated-namebTrHRCf1xNG0": "random-namexQj9TnMBCcoa"
    "arn": {
        "assumed-role/random-generated-namebTrHRCf1xNG0/EC2_ORCHESTRATOR7978706923425647585": "random-generated-nameMcBMncaOPYLI",
        "assumed-role/random-generated-namebTrHRCf1xNG0/MandoService-1720578260957050536": "random-nameoN9HFyT6NipP"
    "instanceID": {},
    "ipv4": {
        "": ""
    "region": {
        "cn-west-1": "eu-south-1"
    "email": {},
    "specifiedStrings": {}


2. Put output/anonymizedrawtext.txt through any third-party service, with limited risk of leaking sensitive information.

3. De-anonymize the logs using the manifest:

python3 rawtext -ifp output/anonymizedlogsafter3rdpartyservice.txt -ofp output/deanonymizedrawtext.txt -imfp output/sensitiveinfomanifest.json -da true

This will use the manifest to replace all the anonymized information with the original information.


Example 2 - Identifying information

This is an example of using LogLicker to find instances of long-term access keys within CloudTrail logs from 2021-12-01 to present.

python3 cloudtrail -omfp output/sensitiveinfo.json  -s 2021-12-01 -l 2000 -rl longTermAccessKeyID

output/sensitveInfo.json then contains:

    "longTermAccessKeyID": {
        "AKIA3SXRZM04YTICAJ25": "AKIAM9A62U40E3GDAS27",
    "shortTermAccessKeyID": {},
    "publicKeyID": {},
    "stsServiceBearerTokenID": {},
    "contextSpecificCredentialID": {},
    "groupID": {},
    "ec2InstanceProfileID": {},
    "iamUserID": {},
    "managedPolicyID": {},
    "roleID": {},
    "certificateID": {},
    "accountID": {},
    "username": {},
    "arn": {},
    "instanceID": {},
    "ipv4": {},
    "region": {},
    "email": {},
    "specifiedStrings": {}





Illustration Cloud

Related Articles

Introducing YetiHunter: An open-source tool to detect and hunt for suspicious activity in Snowflake

Summary On May 30, 2024 Snowflake confirmed many clients were affected by an attacker leveraging compromised NHI credentials to perform data theft. In their notice, Snowflake included some indicators and suggested hunts. Our good friends at Mandiant

Extending Cloud Console Cartographer With New Mappings

Last month Permiso’s P0 Labs released the Cloud Console Cartographer open-source framework and corresponding research presentation at Black Hat Asia in Singapore. Recently we released our full suite of unit tests. Now let’s talk about how to extend

Unmasking Adversary Cloud Defense Evasion Strategies: Modify Cloud Compute Infrastructure Part 2

Detection and Mitigation The 'Create Snapshot', ‘Create Cloud Instance’, ‘Delete Cloud Instance’, ‘Revert Cloud Instance’ and ‘Modify Cloud Compute Configurations’ features are widely available across major cloud platforms such as AWS, Azure, and

View more posts