Friday, 27 February 2026

AWS production log analysis using Claude.ai inside IntelliJ — step-by-step

Short version: collect the logs from AWS (CloudWatch / S3), open them in an IntelliJ project, install a Claude / Claude Code plugin for JetBrains, and use Claude interactively to parse and summarize the logs, write queries, and narrow down the root cause. Below is a practical, reproducible step-by-step guide with code snippets, CloudWatch examples, and ready-to-paste prompts you can use inside the IDE.

Note: there are several IntelliJ plugins that bring Anthropic/Claude functionality into JetBrains IDEs (official and community plugins). Pick one you trust and read its permissions before installing.


1) Plan & safety checklist (do this first)

  1. Never store prod credentials in source code. Use IAM roles, temporary credentials, or an encrypted secrets store (AWS Secrets Manager / SSM Parameter Store / Vault).

  2. Work on redacted or sampled production logs where possible; strip PII, tokens, and IP addresses before the logs leave your environment.

  3. Ensure your organization’s policy allows sending excerpts to external AI services — redact or anonymize anything you cannot transmit.

  4. Make a small sample of logs (1–10 MB) for initial exploration to avoid cost and leakage.
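
The redaction and sampling steps above can be sketched in a few lines of Python. This is a minimal sketch, not a complete PII scrubber: the patterns (IPv4 addresses, email addresses, AWS access key IDs, bearer tokens) and the 1 MB default budget are assumptions chosen for illustration, so extend them for whatever identifiers your own logs carry.

```python
import random
import re

# Illustrative patterns only; extend them for the identifiers your logs contain.
REDACTIONS = [
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<ip>"),
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "<email>"),
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "<aws-key-id>"),
    (re.compile(r"(?i)\bbearer\s+[\w.~+/=-]+"), "Bearer <token>"),
]


def redact(line: str) -> str:
    """Replace every match of every pattern with its placeholder."""
    for pattern, placeholder in REDACTIONS:
        line = pattern.sub(placeholder, line)
    return line


def sample_lines(lines, max_bytes=1_000_000, seed=0):
    """Shuffle reproducibly, then keep redacted lines until the byte budget is spent."""
    rng = random.Random(seed)
    shuffled = list(lines)
    rng.shuffle(shuffled)
    kept, used = [], 0
    for line in shuffled:
        clean = redact(line)
        size = len(clean.encode("utf-8")) + 1  # +1 for the newline
        if used + size > max_bytes:
            break
        kept.append(clean)
        used += size
    return kept
```

Run the sampler on the raw export and hand only its output to the IDE; a fixed seed keeps the sample reproducible when you revisit an incident.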


2) Get the logs out of AWS (options)

You’ll usually choose one of these:

A. CloudWatch Logs Insights — run queries interactively and export results. Good for ad-hoc queries.
B. Export to S3 — for bulk analysis (historical, large datasets).
C. Kinesis / Lambda — for streaming analysis (near real time).

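Example CloudWatch Logs Insights query (finds error lines; the query itself carries no time filter, so set the time range to the last 24 hours in the console's time picker):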

fields @timestamp, @message, @logStream
| filter @message like /(?i)error|exception|traceback/
| sort @timestamp desc
| limit 200

If you prefer to pull logs programmatically, use the AWS SDK. Example Python (boto3) to run an Insights query and download results:

# save as fetch_cw_insights.py
import csv
import time
from datetime import datetime, timedelta, timezone

import boto3

client = boto3.client("logs", region_name="ap-south-1")  # change region


def run_insights_query(log_group_names, query_string, start, end):
    """Start an Insights query and poll until it reaches a terminal status."""
    resp = client.start_query(
        logGroupNames=log_group_names,
        startTime=int(start),  # epoch seconds
        endTime=int(end),
        queryString=query_string,
        limit=1000,
    )
    qid = resp["queryId"]
    while True:
        r = client.get_query_results(queryId=qid)
        # "Timeout" is also terminal; without it a timed-out query would poll forever
        if r["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            break
        time.sleep(1)
    return r


if __name__ == "__main__":
    # Use a timezone-aware "now": calling .timestamp() on a naive utcnow()
    # is off by the machine's UTC offset.
    now = datetime.now(timezone.utc)
    end = int(now.timestamp())
    start = int((now - timedelta(hours=24)).timestamp())
    q = 'fields @timestamp, @message | filter @message like /error/ | sort @timestamp desc | limit 100'
    res = run_insights_query(["/aws/lambda/my-prod-func"], q, start, end)
    if res["status"] != "Complete":
        raise SystemExit(f"Query ended with status {res['status']}")
    # write messages to CSV
    with open("cw_insights.csv", "w", newline="", encoding="utf-8") as f:
        w = csv.writer(f)
        w.writerow(["timestamp", "message"])
        for row in res.get("results", []):
            ts = next((r["value"] for r in row if r["field"] == "@timestamp"), "")
            msg = next((r["value"] for r in row if r["field"] == "@message"), "")
            w.writerow([ts, msg])
    print("Saved cw_insights.csv")
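
For option B (export to S3), a minimal sketch using the boto3 `create_export_task` call might look like the following. The bucket name, prefix, and task-name format are placeholders invented for illustration, and the bucket is assumed to already have a policy that lets CloudWatch Logs write to it. Note that this API takes times in milliseconds, unlike `start_query`, which takes seconds.

```python
from datetime import datetime, timedelta, timezone


def build_export_request(log_group, bucket, prefix, hours=24, now=None):
    """Build keyword arguments for the CloudWatch Logs create_export_task call."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours)
    return {
        "taskName": f"export-{now:%Y%m%dT%H%M%SZ}",  # hypothetical naming scheme
        "logGroupName": log_group,
        "fromTime": int(start.timestamp() * 1000),  # milliseconds since epoch
        "to": int(now.timestamp() * 1000),
        "destination": bucket,  # bucket name, not an ARN
        "destinationPrefix": prefix,
    }


def start_export(request, region_name="ap-south-1"):
    """Submit the export task and return its task ID."""
    import boto3  # imported lazily so the builder above can be tested offline

    client = boto3.client("logs", region_name=region_name)
    return client.create_export_task(**request)["taskId"]
```

Keeping the request builder separate from the AWS call makes it easy to check the time window before anything is submitted, for example `start_export(build_export_request("/aws/lambda/my-prod-func", "my-log-archive", "prod/"))`, where `my-log-archive` is a made-up bucket name.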

3) Create a local IntelliJ project and import logs

  1. Open IntelliJ IDEA → New Project → Empty project (or open an existing repo).

  2. Create a folder logs/ and drop cw_insights.csv into it, or create a small logs.jsonl.

  3. (Optional) Add a small script folder tools/log_analysis/ with the Python script above.
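
If you would rather work with logs.jsonl than the CSV, a small converter is enough. This is a sketch under the assumption that the CSV has the header row written by the fetch script above; the 500-row default limit is arbitrary.

```python
import csv
import json


def csv_to_jsonl(csv_path, jsonl_path, limit=500):
    """Copy up to `limit` rows from the CSV into a JSON Lines file; return the count."""
    written = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            if written >= limit:
                break
            dst.write(json.dumps(row, ensure_ascii=False) + "\n")
            written += 1
    return written
```

A call such as `csv_to_jsonl("logs/cw_insights.csv", "logs/logs.jsonl")` produces one JSON object per line, which is convenient to select and paste in chunks.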


4) Install a Claude / Anthropic plugin into IntelliJ

There are official and community plugins that integrate Claude/Claude Code into JetBrains IDEs — search the JetBrains Marketplace inside IntelliJ and install a plugin that suits your security model (some require an API key; others use a linked account). Examples of available integrations and plugin docs are on Anthropic/Claude pages and the JetBrains Marketplace.

Typical install steps:

  1. IntelliJ → Settings/Preferences → Plugins → Marketplace → search Claude / Claude Code / Claude Code Plus / Claude GUI.

  2. Install and restart IDE.

  3. Open the plugin settings (Tools → Claude or a dedicated toolwindow) and configure authentication:

    • Either input your Anthropic API key (if using a community plugin that requires it), or

    • Connect via the plugin’s sign-in flow (some official integrations require a subscription).

  4. Configure model (e.g., Claude 3.5/4, Sonnet/Opus depending on plugin choices).

Security note: prefer plugins that let you bring your own API key or that run locally. Inspect plugin source or vendor reputation if handling sensitive logs.


5) Basic workflow inside IntelliJ with Claude

Once the plugin is installed you’ll have a Claude tool window (or a chat pane). Use the pattern below:

A. Summarize a log file

  • Select a chunk of lines in the CSV / open the file.

  • Prompt (inside plugin chat):
    "Summarize the following log lines. Give a short bulleted summary of errors, likely root causes, and three suggested next steps. Remove timestamps and redact IP addresses."

  • Paste the log excerpt or the file context.
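
To keep the pasted excerpt small and predictable, you can assemble the prompt with a script instead of selecting lines by hand. The sketch below reuses the prompt text above; the 12,000-character budget is an assumption, not a plugin limit, so tune it to whatever your plugin handles comfortably.

```python
SUMMARY_PROMPT = (
    "Summarize the following log lines. Give a short bulleted summary of errors, "
    "likely root causes, and three suggested next steps. "
    "Remove timestamps and redact IP addresses.\n\n"
)


def build_summary_prompt(lines, max_chars=12_000):
    """Append log lines to the prompt until the character budget is reached."""
    parts, used = [], len(SUMMARY_PROMPT)
    for line in lines:
        line = line.rstrip("\n")
        if used + len(line) + 1 > max_chars:  # +1 for the joining newline
            break
        parts.append(line)
        used += len(line) + 1
    return SUMMARY_PROMPT + "\n".join(parts)
```

Print the result and paste it into the chat pane; lines that do not fit the budget are simply left out, so order the input with the most relevant lines first.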

B. Ask Claude to write parsing / extraction code

  • Prompt: "Write a Python function that reads cw_insights.csv and extracts fields: timestamp, level, service, message. Use regex robust to JSON-log and plain text log entries. Return a list of dicts."

  • Paste the generated code into tools/log_analysis/parse_logs.py, run from IntelliJ terminal or run configurations.

C. Convert natural language to CloudWatch Insights

  • Prompt: "Generate a CloudWatch Insights query that returns the top-10 slowest requests (HTTP path and 90th percentile latency) for the last 3 hours from these logs."

  • Claude will produce a query you can run in CloudWatch.
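
For reference, a query of roughly the following shape is what such a prompt might return. It is a sketch that assumes messages shaped like "... Request GET /api/v1/users 200 123ms"; the field names (method, path, status, latency_ms) are invented for this example and must be adapted to your log format. The three-hour window is not part of the query: it is passed as the start and end time when the query is run.

```python
from datetime import datetime, timedelta, timezone

# Assumes messages shaped like "... Request GET /api/v1/users 200 123ms".
P90_BY_PATH_QUERY = """
parse @message "Request * * * *ms" as method, path, status, latency_ms
| filter ispresent(latency_ms)
| stats pct(latency_ms, 90) as p90_ms, count(*) as requests by path
| sort p90_ms desc
| limit 10
""".strip()


def last_hours_window(hours=3, now=None):
    """Return (start, end) in epoch seconds for the trailing window."""
    now = now or datetime.now(timezone.utc)
    return int((now - timedelta(hours=hours)).timestamp()), int(now.timestamp())
```

The pair returned by `last_hours_window()` can be passed as `startTime` and `endTime` to the boto3 `start_query` call, with `P90_BY_PATH_QUERY` as the `queryString`. Run it against a small window first and compare a few results with the raw lines before trusting the ranking.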

D. Create unit tests / quick checks

  • Ask Claude to produce small unit tests for your parsing function. Paste into tests/test_parse.py and run.

E. Create an RCA report

  • After Claude summarizes and classifies errors, ask for a structured RCA template (title, impact, timeline, cause, remediation, mitigations).


6) Sample prompts you can copy/paste

  • Summarize logs:
    "Summarize these 200 log lines. Output: (1) 3-line summary; (2) Most frequent error signatures; (3) Hypotheses for root cause; (4) 3 recommended next steps for engineers (short actionable items)."

  • Regex for parsing:
    "Write a Python regex that extracts HTTP method, path, status code, latency_ms from log messages like: 'INFO 2026-02-01 Request GET /api/v1/users 200 123ms' and also handles JSON entries with keys method, path, status, latency."

  • Prioritization:
    "From this set of error messages, rank the top 5 unique error signatures by estimated impact (frequency × severity). Explain your calculation."

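As a point of comparison for the regex prompt above, here is one way the answer could look. It is a sketch, not the only reasonable pattern: it assumes the plain-text format from the prompt, a fixed set of HTTP methods, and JSON entries whose latency value is a number in milliseconds.

```python
import json
import re

PLAIN_REQUEST = re.compile(
    r"\b(?P<method>GET|POST|PUT|PATCH|DELETE|HEAD|OPTIONS)\s+"
    r"(?P<path>/\S*)\s+(?P<status>\d{3})\s+(?P<latency_ms>\d+)ms\b"
)


def extract_request(message):
    """Return method, path, status, latency_ms as a dict, or None if nothing matches."""
    try:
        entry = json.loads(message)
    except ValueError:
        entry = None  # not JSON; fall through to the plain-text pattern
    if isinstance(entry, dict) and {"method", "path", "status", "latency"} <= entry.keys():
        return {
            "method": str(entry["method"]),
            "path": str(entry["path"]),
            "status": int(entry["status"]),
            "latency_ms": int(entry["latency"]),  # assumed to be milliseconds
        }
    match = PLAIN_REQUEST.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["latency_ms"] = int(fields["latency_ms"])
    return fields
```

Having a hand-checked version like this makes it easier to judge whatever Claude generates: feed both the same sample lines and compare the output.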

7) Example: using Claude to build a simple analyzer script

Ask Claude to generate code — example Python that parses csv, aggregates counts, and prints top errors:

# tools/simple_analyzer.py
import csv
import json
import re
from collections import Counter


def parse_line(msg):
    """Return (level, message) from a JSON or plain-text log entry."""
    # naive: try JSON first
    try:
        j = json.loads(msg)
        if isinstance(j, dict):
            return j.get("level"), str(j.get("message", ""))
    except ValueError:
        pass
    # fallback regex: level, one skipped token (assumed to be a timestamp), then the message
    m = re.search(r"(ERROR|WARN|INFO)\s+.*?\s(.*)", msg)
    if m:
        return m.group(1), m.group(2)
    return None, msg


def analyze(path):
    c = Counter()
    with open(path, newline="") as f:
        r = csv.DictReader(f)
        for row in r:
            _, message = parse_line(row.get("message") or "")
            # normalize error signature (trim numbers, ids)
            sig = re.sub(r"\b[0-9a-f]{6,}\b", "<id>", message)
            sig = re.sub(r"\d{2,}", "<num>", sig)
            c[sig.strip()[:200]] += 1
    for sig, count in c.most_common(20):
        print(f"{count:5d} {sig}")


if __name__ == "__main__":
    analyze("cw_insights.csv")

You can ask Claude to refine the normalization rules or to output to CSV/JSON for dashboarding.
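
A sketch of the JSON output step, assuming you have a Counter of signature counts like the one built in the analyzer. The record field names (signature, count, share) are made up for this example; pick whatever your dashboard expects.

```python
import json
from collections import Counter


def counts_to_records(counter, top=20):
    """Turn signature counts into a list of records with each signature's share."""
    total = sum(counter.values())
    return [
        {"signature": sig, "count": n, "share": round(n / total, 4)}
        for sig, n in counter.most_common(top)
    ]


def write_report(counter, path, top=20):
    """Write the top signatures to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(counts_to_records(counter, top), f, indent=2)
```

The share column is computed over all signatures, not just the top ones, so it shows how much of the total error volume each signature accounts for.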


8) How to use Claude for root-cause / hypothesis generation

  • Provide context: service name, recent deploys, error timestamps, surrounding logs.

  • Ask Claude to generate differential hypotheses — e.g., code bug vs config vs infra vs network.

  • Ask for evidence to confirm or falsify each hypothesis (what query or metric to run — CPU, 5xx rate, DB connection errors).

Example prompt:
"Given these error signatures and the fact that we deployed service X at 02:10 UTC, propose three plausible root causes and for each give two concrete checks (CloudWatch metric or log query) that will confirm or rule it out."

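One concrete check for a "the deploy caused it" hypothesis is to compare error counts in equal windows before and after the deploy time. The sketch below assumes timestamps in the "YYYY-MM-DD HH:MM:SS.mmm" form that Insights results typically use for @timestamp, interpreted as UTC, and a 30-minute window chosen arbitrarily.

```python
from datetime import datetime, timedelta, timezone


def parse_insights_timestamp(value):
    """Parse a '2026-02-27 02:10:00.000' style timestamp as UTC."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=timezone.utc)


def error_rate_around(timestamps, deploy_time, window_minutes=30):
    """Count events in equal windows before and after the deploy."""
    window = timedelta(minutes=window_minutes)
    before = sum(1 for t in timestamps if deploy_time - window <= t < deploy_time)
    after = sum(1 for t in timestamps if deploy_time <= t < deploy_time + window)
    return {
        "before": before,
        "after": after,
        "per_minute_before": before / window_minutes,
        "per_minute_after": after / window_minutes,
    }
```

A clear jump in the "after" rate supports the deploy hypothesis; similar rates on both sides point toward infrastructure or upstream causes instead. Either way, it is one data point to hand back to Claude along with the error signatures.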

9) Move from ad-hoc to repeatable

  • Save your Claude prompts as templates in the plugin (many plugins let you save conversations or snippets).

  • Wrap common Claude interactions in scripts: e.g., a script that extracts top 100 error messages and opens them in a Claude chat for summarization.

  • Automate exports from CloudWatch to S3 and run nightly analyses on samples (with redaction).
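
If your plugin has no template feature, prompts can live in the repository next to the scripts. A minimal sketch using the standard library's string.Template follows; the template name and wording are illustrative, and the script only builds the prompt text, it does not call any Claude API.

```python
from collections import Counter
from string import Template

PROMPT_TEMPLATES = {
    "summarize": Template(
        "Summarize these $count log messages. Output: (1) 3-line summary; "
        "(2) most frequent error signatures; (3) hypotheses for root cause; "
        "(4) 3 recommended next steps.\n\n$body"
    ),
}


def top_messages(messages, n=100):
    """Return the n most frequent distinct messages, most frequent first."""
    return [msg for msg, _ in Counter(messages).most_common(n)]


def render_prompt(name, messages, n=100):
    """Fill the named template with the top-n distinct messages."""
    top = top_messages(messages, n)
    return PROMPT_TEMPLATES[name].substitute(count=len(top), body="\n".join(top))
```

Deduplicating before rendering keeps the prompt short when one error dominates the export, and versioning the templates means everyone on the team asks the same question the same way.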


10) Visualizing / reporting

  • After Claude produces structured output (JSON/CSV), import it into your preferred dashboard (quick options: Grafana, Kibana, or a simple Excel/Google Sheets workbook).

  • Use IntelliJ to iterate on transformation scripts and keep them in version control.

  • For high-value incidents, use Claude to draft the postmortem text (timeline, impact, mitigation, follow-ups) and then edit for accuracy.


11) Security & compliance reminders

  1. Redact PII and secrets before sending data to external AI models, especially where your policy forbids transmitting them.

  2. Audit plugin network access (some plugins call external MCP servers). If in doubt, prefer CLI tools that run locally and let you paste only the minimal excerpt into Claude.

  3. Keep logs stored with least privilege and use short-lived tokens for programmatic access.
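
For the short-lived tokens in item 3, one common approach is to assume a narrowly scoped role through STS and build the logs client from the temporary credentials. The sketch below assumes a read-only log role exists; the role ARN you pass in and the session name "log-analysis" are placeholders.

```python
def session_kwargs(credentials):
    """Map an STS Credentials dict onto boto3.Session keyword arguments."""
    return {
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }


def short_lived_logs_client(role_arn, region_name="ap-south-1", minutes=15):
    """Assume the given role and return a CloudWatch Logs client that expires with it."""
    import boto3  # lazy import keeps session_kwargs testable without AWS access

    sts = boto3.client("sts")
    resp = sts.assume_role(
        RoleArn=role_arn,  # hypothetical read-only log role
        RoleSessionName="log-analysis",
        DurationSeconds=minutes * 60,  # STS accepts no less than 900 seconds
    )
    session = boto3.Session(region_name=region_name, **session_kwargs(resp["Credentials"]))
    return session.client("logs")
```

The returned client can replace the module-level client in the fetch script, so no long-lived key ever needs to sit on the analysis machine.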


12) Troubleshooting tips (IntelliJ + Claude)

  • If the plugin needs an external claude CLI, install and ensure it’s on PATH before opening the plugin.

  • If the plugin UI is slow, increase memory for IntelliJ (VM options) or use smaller excerpts.

  • If you hit rate limits, move heavier analysis to local scripts and use Claude only for summarization and guidance.
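
The PATH check from the first tip can be done from Python before launching the IDE. This is a tiny sketch; the executable name `claude` is taken from the tip above and may differ for your plugin.

```python
import shutil


def find_cli(name="claude"):
    """Return the absolute path of the executable, or None if it is not on PATH."""
    return shutil.which(name)
```

If `find_cli()` returns None in the same shell you start IntelliJ from, fix PATH there first; IDEs launched from a desktop shortcut may see a different PATH than your terminal.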


13) Real example workflow (concise)

  1. Export 1,000 error lines from CloudWatch Insights to cw_insights.csv.

  2. Open project in IntelliJ; install Claude plugin; authenticate.

  3. Select 200 lines in cw_insights.csv → ask Claude: summarize & produce 3-step remediation.

  4. Ask Claude to write a parser → paste the code into tools/log_analysis/parse_logs.py and run tests.

  5. Use the parser output to produce a frequency table and create a Grafana panel or CSV report.

  6. Use Claude to draft an RCA and email template for stakeholders.


14) Additional resources & reading

  • Anthropic / Claude Code docs for JetBrains integrations (plugin docs and official guides).

  • JetBrains Marketplace (search “Claude”, “Claude Code”, “Claude GUI”) for plugin options and install instructions.


Final tips

  • Start small — use short, redacted excerpts in the IDE to validate findings.

  • Use Claude to generate reproducible queries and code, but always validate outputs against the raw logs — AI suggestions are helpful, not authoritative.

  • Keep a reproducible pipeline: CloudWatch → S3 → tools/ scripts → Claude-assisted summaries → dashboards & RCA.
