A regular expression is a “prefix expression” if it starts with a caret (^) or a left anchor (\A), followed by a string of simple symbols. For example, the regex /^abc./ will be optimized by matching only against the values from the index that start with abc. Regular expressions are used when you want to search for specific lines of text containing a particular pattern. Most UNIX utilities operate on text one line at a time.
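To illustrate why a prefix expression can be optimized, here is a minimal sketch that assumes a sorted in-memory list stands in for the index. The helper names literal_prefix and prefix_search are hypothetical and do not come from any particular database. The prefix extraction is deliberately conservative and ignores flags such as case-insensitivity.

```python
import bisect
import re

# Characters that end the run of "simple symbols" after the anchor.
_META = set(".^$*+?{}[]\\|()")


def literal_prefix(pattern):
    """Return the literal text after a leading ^ or backslash-A anchor.

    Returns None when the pattern is not a prefix expression.
    """
    if pattern.startswith("^"):
        rest = pattern[1:]
    elif pattern.startswith("\\A"):
        rest = pattern[2:]
    else:
        return None
    if "|" in rest:
        # Alternation may allow matches that do not share the prefix.
        return None
    chars = []
    for ch in rest:
        if ch in _META:
            # A following *, ? or {m,n} can make the last literal optional.
            if ch in "*?{" and chars:
                chars.pop()
            break
        chars.append(ch)
    return "".join(chars)


def prefix_search(sorted_values, pattern):
    """Match pattern against sorted_values, scanning only the prefix range."""
    regex = re.compile(pattern)
    prefix = literal_prefix(pattern)
    if not prefix:
        # No usable prefix: every value has to be tested.
        return [v for v in sorted_values if regex.search(v)]
    matches = []
    start = bisect.bisect_left(sorted_values, prefix)
    for i in range(start, len(sorted_values)):
        value = sorted_values[i]
        if not value.startswith(prefix):
            break  # left the contiguous block of values sharing the prefix
        if regex.search(value):
            matches.append(value)
    return matches


values = sorted(["abc", "abcd", "abx", "abcz9", "b", "aabc"])
print(prefix_search(values, "^abc."))  # ['abcd', 'abcz9']
```

In a sorted index, all values that share a prefix sit next to each other, so the scan can stop at the first value that no longer starts with abc.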
'''
FindInstructions.py: A script to help you find desired opcodes/instructions in a database

The script accepts opcodes and assembly statements (which will be assembled) separated by semicolons

The general syntax is:
  find(asm or opcodes, x=Bool, asm_where=ea)

* Example:
  find('asm_statement1;asm_statement2;de ea dc 0d e0;asm_statement3;xx yy zz;...')
* To filter out non-executable segments pass x=True
  find('jmp dword ptr [esp]', x=True)
* To specify in which context the instructions should be assembled, pass asm_where=ea:
  find('jmp dword ptr [esp]', asm_where=here())

Copyright (c) 1990-2009 Hex-Rays
ALL RIGHTS RESERVED.

v1.0 - initial version
'''
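import re  # needed by re.compile() below; missing from the listing

import idaapi
import idautils
import idc

# Bare names such as FirstSeg(), Assemble() and Choose are assumed to be
# available in the script namespace of the IDAPython version this targets.
# -----------------------------------------------------------------------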
def FindInstructions(instr, asm_where=None):
    '''
    Finds instructions/opcodes
    @return: Returns a tuple(True, [ ea, ... ]) or a tuple(False, 'error message')
    '''
    if not asm_where:
        # get first segment
        asm_where = FirstSeg()
        if asm_where == idaapi.BADADDR:
            return (False, 'No segments defined')
    # regular expression to distinguish between opcodes and instructions
    re_opcode = re.compile('^[0-9a-f]{2} *', re.I)
    # split lines
    lines = instr.split(';')
    # all the assembled buffers (for each instruction)
    bufs = []
    for line in lines:
        if re_opcode.match(line):
            # convert from hex string to a character list then join the list to form one string
            buf = ''.join([chr(int(x, 16)) for x in line.split()])
        else:
            # assemble the instruction
            ret, buf = Assemble(asm_where, line)
            if not ret:
                return (False, 'Failed to assemble:' + line)
        # add the assembled buffer
        bufs.append(buf)
    # join the buffer into one string
    buf = ''.join(bufs)
    # take total assembled instructions length
    tlen = len(buf)
    # convert from binary string to space separated hex string
    bin_str = ' '.join(['%02X' % ord(x) for x in buf])
    # find all binary strings
    print('Searching for: [%s]' % bin_str)
    ea = MinEA()
    ret = []
    while True:
        ea = FindBinary(ea, SEARCH_DOWN, bin_str)
        if ea == idaapi.BADADDR:
            break
        ret.append(ea)
        Message('.')
        ea += tlen
    if not ret:
        return (False, 'Could not match [%s]' % bin_str)
    Message('\n')
    return (True, ret)

# -----------------------------------------------------------------------
# Chooser class
class SearchResultChoose(Choose):
    def __init__(self, list, title):
        Choose.__init__(self, list, title)
        self.width = 250

    def enter(self, n):
        # n is 1-based in the chooser, the list is 0-based
        o = self.list[n-1]
        Jump(o.ea)

# -----------------------------------------------------------------------
# class to represent the results
class SearchResult:
    def __init__(self, ea):
        self.ea = ea
        # make sure the match is disassembled before rendering it
        if not isCode(GetFlags(ea)):
            MakeCode(ea)
        t = idaapi.generate_disasm_line(ea)
        if t:
            line = idaapi.tag_remove(t)
        else:
            line = ''
        func = GetFunctionName(ea)
        self.display = hex(ea) + ': '
        if func:
            self.display += func + ': '
        else:
            n = SegName(ea)
            if n: self.display += n + ': '
        self.display += line

    def __str__(self):
        return self.display

# -----------------------------------------------------------------------
def find(s=None, x=False, asm_where=None):
    b, ret = FindInstructions(s, asm_where)
    if b:
        # executable segs only?
        if x:
            results = []
            for ea in ret:
                seg = idaapi.getseg(ea)
                if (not seg) or (seg.perm & idaapi.SEGPERM_EXEC) == 0:
                    continue
                results.append(SearchResult(ea))
        else:
            results = [SearchResult(ea) for ea in ret]
        title = 'Search result for: [%s]' % s
        idaapi.close_chooser(title)
        c = SearchResultChoose(results, title)
        c.choose()
    else:
        # ret holds the error message in this case
        print(ret)

# -----------------------------------------------------------------------
print("Please use find('asm_stmt1;xx yy;...', x=Bool, asm_where=ea) to search for instructions or opcodes. Specify x=True to filter out non-executable segments")
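The way the script separates raw opcode bytes from assembly statements can be tried outside IDA. The sketch below is a standalone illustration, not part of the original script: classify and STRICT are hypothetical names, and the stricter pattern is only a suggestion. It also shows that the script's pattern, which checks just the first two characters, treats a mnemonic such as add as hex.

```python
import re

# Pattern used by the script: only the first two characters are checked.
LOOSE = re.compile('^[0-9a-f]{2} *', re.I)

# Hypothetical stricter variant: the whole part must be hex byte pairs.
STRICT = re.compile(r'^(?:[0-9a-f]{2} *)+$', re.I)


def classify(instr, pattern=STRICT):
    """Split on ';' and label each part as opcode bytes or assembly text."""
    parts = []
    for line in instr.split(';'):
        if pattern.match(line):
            # Normalise to the space-separated upper-case hex form
            # that the script passes to its binary search.
            hex_str = ' '.join('%02X' % int(x, 16) for x in line.split())
            parts.append(('opcodes', hex_str))
        else:
            parts.append(('asm', line))
    return parts


print(classify('jmp esp;de ea dc 0d;add eax, 1'))
# The loose pattern also accepts 'add eax, 1' because 'ad' is valid hex.
print(bool(LOOSE.match('add eax, 1')))
```

With the loose pattern, statements whose mnemonic begins with two hex digits (add, dec, and so on) would be sent down the hex-conversion path instead of being assembled.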
It would be very useful if Google provided a regular expression search.
Is there a way to do this?
(Note: I am not talking about pseudo regular expressions such as site:, filetype:, AND, OR, or 'Text'. I would like to search with regular-expression syntax such as .+[]^.) For example, is there an application, a site, or a Google tool to search for things like *.stackexchange?
GarouDan
Migrated from webmasters.stackexchange.com on Oct 8 '11 at 22:00.
This question came from our site for pro webmasters.
9 Answers
This feature is not available in classic Google Search and it's not on Google's roadmap. You can learn more about this topic by watching the Google video Will Google implement the ability to search with regular expressions?
However, there's one exception. Google Code Search supports regular expressions. Of course, the search target of this topic-specific search engine is limited to source code.
It is worth mentioning that some Google search keywords can partially replace regular expressions. For example, if you want to search for any two-word variation of 'search TERM', you can use the wildcard operator (*). A query that pairs search with the wildcard will find results for search and any other (one) word. I often use it to check basic English grammar rules or synonyms (e.g., 'as easy as *').
Simone Carletti
Google Search can return the matches of some simple regular expressions. For example, the search query
appears to be equivalent to
You can see the output of this search query here.
Anderson Green
SymbolHound has an open source code repository search, similar to the now-discontinued Google Code Search option, in addition to a symbol-inclusive web search that indexes programming-related sites such as Stack Overflow.
Tom
You can write a piece of software to:
- Take the keywords from the regular expression;
- Google the keywords and get a list of results;
- Crawl each resulting URI and filter it with the complete regular expression.
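The three steps above can be sketched as follows. This is a rough outline under stated assumptions, not a finished tool: search and fetch are hypothetical caller-supplied functions (no real search-engine API is assumed), and the keyword extraction is naive, so it treats every literal word in the pattern as required even when it sits in an optional or alternative branch.

```python
import re


def keywords_from_regex(pattern, min_len=3):
    """Pull literal words out of a regex to use as plain search keywords."""
    # Blank out escapes, character classes and counted repeats first.
    stripped = re.sub(r'\\.|\[[^\]]*\]|\{[^}]*\}', ' ', pattern)
    words = re.findall(r'[^\W_]+', stripped)
    return [w for w in words if len(w) >= min_len]


def regex_search(pattern, search, fetch, limit=300):
    """Return the URLs whose page text matches the full regular expression.

    search(query, limit) -> iterable of URLs   (hypothetical, caller-supplied)
    fetch(url)           -> page text or None  (hypothetical, caller-supplied)
    """
    regex = re.compile(pattern)
    query = ' '.join(keywords_from_regex(pattern))
    hits = []
    for url in search(query, limit):
        text = fetch(url)
        if text is not None and regex.search(text):
            hits.append(url)
    return hits


# Stand-ins for a real search engine and crawler, for demonstration only.
pages = {
    'u1': 'Audio: Spanish, English',
    'u2': 'Audio: English',
    'u3': 'Text: Spanish',
}
print(regex_search(r'Audio:.*Spanish',
                   search=lambda query, limit: list(pages)[:limit],
                   fetch=pages.get))
```

Passing search and fetch in as parameters keeps the filtering logic independent of whichever search service and HTTP client are actually used.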
Let's study a case: from site:gog.com, find all games that have a Spanish voice-over. The regular expression is:
It shall match, for example:
And not match:
Step 1. Let your software search this on Google:
inurl:game here means to search only in game description pages.
Step 2. Get the 300 resulting links and crawl every one of them.
Step 3. Filter the result with the given regular expression:
This should be easy to build. In fact, I don't understand why I couldn't find something that is already built that way.
Since search engines can't afford the resources to scan their data with regular expressions, this dirty job falls to you, and your computer should do it with what search engines already provide.
Tankman六四
No, unfortunately not :(. In theory you could make your own search engine and do it, but that would be pretty hard.
invisible bob
Just for reference, Google's help on search operators is here.
Interestingly, '-' is still an operator for word exclusion, but they removed '+' as an operator, used in the past to require a given search term. Apparently, 'The + operator was retired when Google+ was launched, because + was needed as a searchable character rather than an operator.'(https://support.google.com/websearch/answer/2466433)
ludinom
You could start with a detailed Google search to cull the target text to search. Then open, say, the top 50 results in multiple tabs and use mingyi's 'Fastest Search' Firefox addon to search the results using a regular expression.
user60402
Google now supports and fully documents the use of RegEx. Here is the link for reference:
Hernan Pelassini
If you know VBA, you can write some code to get data from the web to Excel. I run the program day and night and can get millions of results. After that you can filter from those results.
Nguyen Kieng Hiep