README.md

This folder contains sample data for the various data connectors, which can be leveraged by all Microsoft Sentinel contributions.

Sample Data Contribution Guidance

Sample data is extremely useful when troubleshooting issues and when supporting or enhancing data connectors with more security-focused content (such as Analytics, Hunting Queries, Workbooks, etc.). Therefore, for every data connector committed, authors must also upload the following three (3) files:

| Expected file name | Source | Expected samples in the file | Expected file extension |
|---|---|---|---|
| ProductName_RawLogs | Product | Raw logs directly from the source of the logs | .txt* (for CEF/Syslog-based data connectors) or .json (for API-based data connectors) |
| ProductName_IngestedLogs | Log Analytics Workspace | Logs exported after ingestion into a Log Analytics Workspace | .csv* for all data connectors |
| ProductName_Schema | Log Analytics Workspace | The table schema exported from Log Analytics | .csv* for all data connectors |

Note: Replace "ProductName" with the actual name of the Product or data connector.

*Guidance on how to extract these files is below.

Important: Contributors must upload log samples of all types of events that are generated by the product and captured by the data connector. These events may include the different event results and response actions that the product generates. It's also important to ensure that the log details include fields and/or values that can be normalized; please refer to the Advanced Security Information Model (ASIM) documentation for more details. These fields include, but are not limited to, usernames, IP addresses, IDs, and hostnames.

Log Format Guidance

Raw logs (directly from the source)

The format of the raw-data file varies depending on the type of connector: .json for API-based data connectors, or plain text (.txt) for Syslog/CEF-based data connectors, with the column/property names adhering to the data type's property names.

Below is a sample of the CEF formatted logs in their raw form:

 Mar 20 10:12:18 192.168.1.5 CEF: 0|Check Point|VPN-1 & FireWall-1|Check Point|geo_protection|Log|Unknown|act=Drop cs3Label=Protection Type cs3=geo_protection deviceDirection=0 rt=1584698718000 spt=58429 dpt=27016 ifname=eth0 logid=65536 loguid={0x5e74955f,0x0,0x501a8c0,0x19633097} origin=192.168.1.5 originsicname=cn=cp_mgmt,o=FlemingGW..y76ath sequencenum=2 version=5 dst=192.168.1.5 dst_country=Internal inspection_information=Geo-location inbound enforcement inspection_profile=Default Geo Policy product=VPN-1 & FireWall-1 proto=17 src=123.113.101.36 src_country=Other 
 Mar 20 10:12:19 192.168.1.5 CEF: 0|Check Point|VPN-1 & FireWall-1|Check Point|geo_protection|Log|Unknown|act=Drop cs3Label=Protection Type cs3=geo_protection deviceDirection=0 rt=1584698718000 spt=58429 dpt=27019 ifname=eth0 logid=65536 loguid={0x5e749560,0x0,0x501a8c0,0x19633097} origin=192.168.1.5 originsicname=cn=cp_mgmt,o=FlemingGW..y76ath sequencenum=3 version=5 dst=192.168.1.5 dst_country=Internal inspection_information=Geo-location inbound enforcement inspection_pro^C
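To illustrate how such a line breaks down into normalizable fields, here is a minimal parsing sketch: the pipe-delimited CEF header followed by key=value extension pairs. This is purely illustrative (it is not the connector's actual parser), and the abbreviated `SAMPLE` line below is trimmed from the example above:

```python
import re

# Abbreviated copy of the first sample CEF line above (illustrative only).
SAMPLE = (
    "Mar 20 10:12:18 192.168.1.5 CEF: 0|Check Point|VPN-1 & FireWall-1|"
    "Check Point|geo_protection|Log|Unknown|act=Drop cs3Label=Protection Type "
    "cs3=geo_protection spt=58429 dpt=27016 src=123.113.101.36 dst=192.168.1.5"
)

HEADER_FIELDS = ["version", "device_vendor", "device_product",
                 "device_version", "signature_id", "name", "severity"]

def parse_cef(line: str) -> dict:
    """Split a raw CEF line into its pipe-delimited header and key=value extensions."""
    _, _, rest = line.partition("CEF:")
    parts = rest.strip().split("|", 7)
    record = dict(zip(HEADER_FIELDS, parts[:7]))
    # Extension values may contain spaces, so only split where a new key= begins.
    for token in re.split(r"\s+(?=\w+=)", parts[7]):
        key, _, value = token.partition("=")
        record[key] = value
    return record
```

Fields such as `src`, `dst`, `spt`, and `dpt` are exactly the kind of normalizable values (IP addresses, ports) that ASIM-focused content relies on.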

Below is a sample of a syslog message in its raw form:

 <165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 [exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] BOMAn application event log entry.

Raw logs from API-based connectors can be extracted by using an API platform (such as Postman) to make an API call to the product and capture the response. Below is a sample API response captured in its raw form:

[
  {
    "ts": "2020-03-20T16:00:10.144989Z",
    "eventType": "File Scanned",
    "clientName": "COMPUTER-M-V78J",
    "clientMac": "10:dd:b1:eb:88:f8",
    "clientIp": "192.168.128.2",
    "srcIp": "192.168.128.2",
    "destIp": "119.192.233.48",
    "protocol": "http",
    "uri": "http://www.favorite-icons.com/program/FavoriteIconsUninstall.exe",
    "canonicalName": "PUA.Win.Dropper.Kraddare::1201",
    "destinationPort": 80,
    "fileHash": "3ec1b9a95fe62aa25fc959643a0f227b76d253094681934daaf628d3574b3463",
    "fileType": "MS_EXE",
    "fileSizeBytes": 193688,
    "disposition": "Malicious",
    "action": "Blocked"
  },
  {
    "ts": "2022-03-08T01:18:30.072163Z",
    "eventType": "IDS Alert",
    "deviceMac": "ac:17:c8:21:1c:70",
    "clientMac": "",
    "srcIp": "45.137.23.246:42101",
    "destIp": "84.14.28.183:9034",
    "protocol": "udp/ip",
    "priority": "1",
    "classification": "9",
    "blocked": false,
    "message": "SERVER-OTHER RealTek UDPServer command injection attempt",
    "signature": "1:58853:1",
    "sigSource": "ids-vrt-balanced",
    "ruleId": "meraki:intrusion/snort/GID/1/SID/58853"
  }
]
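Once the response body has been captured (for example via Postman's save-response option), it can be validated and written out as the expected ProductName_RawLogs.json file. A minimal sketch, assuming the response text is already in hand (the abbreviated `response_text` below is hypothetical):

```python
import json

# Hypothetical response body captured from the product's API (e.g., via Postman).
response_text = """[
  {"ts": "2020-03-20T16:00:10.144989Z", "eventType": "File Scanned", "srcIp": "192.168.128.2"}
]"""

events = json.loads(response_text)      # fails fast if the capture is not valid JSON
with open("ProductName_RawLogs.json", "w") as f:
    json.dump(events, f, indent=2)      # pretty-printed sample file for the PR
```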

Post-ingestion logs

Post-ingestion logs are exported from Log Analytics using the Export option in the query window. The exported file is a CSV, irrespective of the data connector type. These logs are important for understanding how the information from the raw logs has been mapped to fields.

Schema

The schema, like the post-ingestion logs, can be exported from Log Analytics using the Export option in the query window. The exported file is a CSV. It is important for understanding the schema of the table into which the logs are ingested.

Log Extraction Guidance

Extracting ingested logs from Log Analytics Workspace

Ingested logs can be extracted by running a KQL query in the Logs window of Microsoft Sentinel/Log Analytics Workspace. A basic query that returns all logs ingested by a data connector will give you the logs along with the defined schema. After you run the query, click Export and then click Export to CSV - all columns.


Extracting raw logs for CEF/Syslog based connectors

There are several ways to capture the original data that comes from syslog devices and is ingested into a syslog-ng or rsyslog server. One way is to capture the traffic on the syslog-ng or rsyslog server over port 514. You can use the following command to capture the traffic into a pcap file:

sudo tcpdump -s 0 -Ani any port 514 -vv -w /var/log/syslog.pcap


Once we have the pcap file, we can render the events in a readable format using the "tcpick" utility:

tcpick -C -yP -r syslog.pcap > sampledata.log
nano sampledata.log


Extracting the schema

To extract the schema of the table in a csv file, run the following query in a log analytics query window:

TableName | getschema

Note: Replace "TableName" in the above query with the actual name of the table before executing it in Log Analytics. The query returns the schema of the table, which can then be exported to a CSV file using the Export option, as described above for post-ingestion logs.


Sample data upload to GitHub

Once you've gathered all three files, submit them via a GitHub PR. All three files must reside inside a folder called "Sample Data" within the Solution folder. Example folder structure: "Azure-Sentinel/Solutions/<SolutionName>/Sample Data/".

Important: Please ensure all sample data has been scrubbed to remove any sensitive PII that may exist in the logs. The intent is to understand the "what" and "how" from the logs, not the "who".
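As one possible approach, simple pattern replacement can scrub common identifiers before samples are committed. The sketch below (illustrative, not an official tool; the placeholder values are arbitrary) replaces IPv4 and email addresses; real logs may also need hostnames, usernames, and IDs scrubbed:

```python
import re

def scrub(text: str) -> str:
    """Replace common PII patterns with safe placeholder values."""
    # IPv4 addresses -> a private placeholder address
    text = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "10.0.0.1", text)
    # Email addresses -> a generic example address
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "user@example.com", text)
    return text
```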