Python-Iocextract - Advanced Indicator Of Compromise (Ioc) Extractor
Monday, September 9, 2019
Advanced Indicator of Compromise (IOC) extractor.
Overview
This library extracts URLs, IP addresses, MD5/SHA hashes, email addresses, and YARA rules from text corpora. It includes some encoded and "defanged" IOCs in the output, and optionally decodes/refangs them.
The Problem
It is common practice for malware analysts or endpoint software to "defang" IOCs such as URLs and IP addresses, in order to prevent accidental exposure to live malicious content. Being able to extract and aggregate these IOCs is often valuable for analysts. Unfortunately, existing "IOC extraction" tools often pass right by them, as they are not caught by standard regex.
For example, the simple defanging technique of surrounding periods with brackets:

    127[.]0[.]0[.]1

The Solution
By combining specially crafted regex with some custom postprocessing, we are able to both detect and deobfuscate "defanged" IOCs. This saves time and effort for the analyst, who might otherwise have to manually find and convert IOCs into machine-readable format.
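As a rough illustration of the "regex plus postprocessing" idea, here is a minimal standard-library sketch for bracket-defanged IPv4 addresses. The names STANDARD_IPV4, DEFANGED_IPV4, strip_dot_wrappers, and find_ipv4s are hypothetical and are not part of iocextract, whose own patterns cover far more cases.

```python
import re

# A typical "standard" IPv4 regex: plain dots only, so defanged IPs are missed.
STANDARD_IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

# Hypothetical tolerant pattern: each dot may be bare, or wrapped as [.] or (.).
# Octet ranges (0-255) are deliberately not validated in this sketch.
DOT = r"(?:\[\.\]|\(\.\)|\.)"
DEFANGED_IPV4 = re.compile(r"\b\d{1,3}" + (DOT + r"\d{1,3}") * 3 + r"\b")


def strip_dot_wrappers(ip):
    """Postprocessing step: remove the brackets/parentheses around dots."""
    return re.sub(r"[\[\]()]", "", ip)


def find_ipv4s(text, refang=False):
    """Yield IPv4-like strings, defanged or not, optionally refanged."""
    for match in DEFANGED_IPV4.finditer(text):
        ip = match.group(0)
        yield strip_dot_wrappers(ip) if refang else ip


if __name__ == "__main__":
    sample = "Beacon to 127[.]0[.]0[.]1 and 192.168.1.10 observed."
    print(STANDARD_IPV4.findall(sample))          # the defanged IP is skipped
    print(list(find_ipv4s(sample)))               # both IPs, as written
    print(list(find_ipv4s(sample, refang=True)))  # both IPs, machine-readable
```

The tolerant pattern does the detection, and the small cleanup function does the deobfuscation, which mirrors the two-step approach described above.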
A Simple Use Case
Many Twitter users post C2s or other valuable IOC information with defanged URLs. For example, this tweet from @InQuest:

    Recommended reading and great work from @unit42_intel: https://researchcenter.paloaltonetworks.com/2018/02/unit42-sofacy-attacks-multiple-government-entities/ ... InQuest customers have had detection for threats delivered from hotfixmsupload[.]com since 6/3/2017 and cdnverify[.]net since 2/1/18.

Running this through the extractor pulls out the URLs:

    https://researchcenter.paloaltonetworks.com/2018/02/unit42-sofacy-attacks-multiple-government-entities/
    hotfixmsupload[.]com
    cdnverify[.]net

Passing refang=True at extraction time would remove the obfuscation, but since these are real IOCs, let's leave them defanged in our documentation. :)
Installation
You may need to install the Python development headers in order to install the regex dependency. On Ubuntu/Debian-based systems, try:

    sudo apt-get install python-dev

Then install iocextract from pip:

    pip install iocextract

If installation fails on Windows, try installing regex directly by downloading the appropriate wheel from PyPI and running e.g.:

    pip install regex-2018.06.21-cp27-none-win_amd64.whl

Usage
Try extracting some defanged URLs:

    >>> content = """
    ... I really love example[.]com!
    ... All the bots are on hxxp://example.com/bad/url these days.
    ... C2: tcp://example[.]com:8989/bad
    ... """
    >>> import iocextract
    >>> for url in iocextract.extract_urls(content):
    ...     print(url)
    ...
    hxxp://example.com/bad/url
    tcp://example[.]com:8989/bad
    example[.]com
    tcp://example[.]com:8989/bad

Note that some URLs may show up twice if they are caught by multiple regexes.
If you want, you can also "refang", or remove common obfuscation methods from IOCs:

    >>> for url in iocextract.extract_urls(content, refang=True):
    ...     print(url)
    ...
    http://example.com/bad/url
    http://example.com:8989/bad
    http://example.com
    http://example.com:8989/bad

Hex-encoded URLs can be extracted and decoded as well:

    >>> content = '612062756e6368206f6620776f72647320687474703a2f2f6578616d706c652e636f6d2f70617468206d6f726520776f726473'
    >>> for url in iocextract.extract_urls(content):
    ...     print(url)
    ...
    687474703a2f2f6578616d706c652e636f6d2f70617468
    >>> for url in iocextract.extract_urls(content, refang=True):
    ...     print(url)
    ...
    http://example.com/path

All extract_* functions in this library return iterators, not lists. The benefit of this behavior is that iocextract can process extremely large inputs with very low overhead. However, if for some reason you need to iterate over the IOCs more than once, you will have to save the results as a list:

    >>> list(iocextract.extract_urls(content))
    ['hxxp://example.com/bad/url', 'tcp://example[.]com:8989/bad', 'example[.]com', 'tcp://example[.]com:8989/bad']

A command-line tool is also included:

    $ iocextract -h
    usage: iocextract [-h] [--input INPUT] [--output OUTPUT] [--extract-emails]
                      [--extract-ips] [--extract-ipv4s] [--extract-ipv6s]
                      [--extract-urls] [--extract-yara-rules] [--extract-hashes]
                      [--custom-regex REGEX_FILE] [--refang] [--strip-urls]
                      [--wide]

    Advanced Indicator of Compromise (IOC) extractor. If no arguments are
    specified, the default behavior is to extract all IOCs.

    optional arguments:
      -h, --help            show this help message and exit
      --input INPUT         default: stdin
      --output OUTPUT       default: stdout
      --extract-emails
      --extract-ips
      --extract-ipv4s
      --extract-ipv6s
      --extract-urls
      --extract-yara-rules
      --extract-hashes
      --custom-regex REGEX_FILE
                            file with custom regex strings, one per line, with one
                            capture group each
      --refang              default: no
      --strip-urls          remove possible garbage from the end of urls. default:
                            no
      --wide                preprocess input to allow wide-encoded character
                            matches. default: no

Should I Use iocextract?
Are you...
Extracting possibly-defanged IOCs from plain text, like the contents of tweets or blog posts?
Yes! This is exactly what iocextract was designed for, and where it performs best. Want to go a step further and automate extraction and storage? Check out ThreatIngestor.
Extracting URLs that have been hex or base64 encoded?
Yes, but the CLI might not give you the best results. Try writing a Python script and calling iocextract.extract_encoded_urls directly.
Note that you will most likely end up with extra garbage at the end of URLs.
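To make the hex case concrete, the following standard-library sketch shows one way a hex-encoded URL can be recovered from text. It is an assumption-laden illustration, not how iocextract implements extract_encoded_urls; the names HEX_RUN, PLAIN_URL, and decode_hex_urls are hypothetical.

```python
import re

# Hypothetical patterns: a run of at least 8 hex byte pairs, and a plain URL.
HEX_RUN = re.compile(r"\b(?:[0-9a-fA-F]{2}){8,}\b")
PLAIN_URL = re.compile(r"https?://\S+")


def decode_hex_urls(text):
    """Yield URLs found inside hex-encoded runs of the input text."""
    for run in HEX_RUN.finditer(text):
        try:
            decoded = bytes.fromhex(run.group(0)).decode("ascii")
        except ValueError:
            continue  # the run was hex-like but did not decode to ASCII text
        for url in PLAIN_URL.finditer(decoded):
            yield url.group(0)


if __name__ == "__main__":
    plaintext = "a bunch of words http://example.com/path more words"
    encoded = plaintext.encode("ascii").hex()
    for url in decode_hex_urls("dump: " + encoded):
        print(url)
```

In this sketch the URL ends at the first whitespace in the decoded text. When the encoded data has no such delimiter, whatever follows the URL is kept, which is one way the trailing garbage mentioned above can arise.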
Extracting IOCs that have not been defanged, from HTML/XML/RTF?
Maybe, but you should consider using the --strip-urls CLI flag (or the strip=True parameter in the library), and you may still get some extra garbage in your output.
If you're extracting from HTML, consider using something like Beautiful Soup to first isolate the text content, and then pass that to iocextract.
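The HTML advice can be sketched as follows. To keep the example dependency-free, it swaps Beautiful Soup for Python's built-in html.parser module; the class TextCollector and the function html_to_text are hypothetical names. With Beautiful Soup installed, BeautifulSoup(html, "html.parser").get_text() would play the same role.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect visible text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text content of an HTML string, one chunk per line."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    page = (
        "<p>C2 seen at <b>hxxp://example[.]com/bad</b></p>"
        "<script>var u = 'http://cdn.example.org/x.js';</script>"
    )
    text = html_to_text(page)
    try:
        import iocextract  # third-party: pip install iocextract
    except ImportError:
        print(text)  # without the library, just show the isolated text
    else:
        for url in iocextract.extract_urls(text, strip=True):
            print(url)
```

Stripping markup first keeps tag attributes and script bodies out of the extractor's input, so fewer non-IOC strings reach the regex.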
Extracting IOCs that have not been defanged, from binary data like executables, or very large inputs?
Probably not. The regex in iocextract is designed to be flexible to catch defanged IOCs, so it performs significantly worse than a solution that is designed to catch just standard IOCs.
Consider using something like Cacador instead.
