Tue 30 Mar 2010
Oracle sql_id and hash value
Posted by Slavik under Oracle, technical tips
[17] Comments
Tanel published a great post a while ago talking about Oracle’s sql_id and hash values in Oracle 10g+. I wanted to be able to compute sql_id and hash values directly from SQL statements for our Hedgehog product. I did a few tests and could not match the MD5 value generated from the SQL statement to the MD5 value Oracle is calculating in X$KGLOB.KGLNAHSV. After a short discussion with Tanel, it turned out that Oracle is appending a NULL (‘\0′) value to the statement and then calculates the MD5.
Here is a test and some code in Python:
SYS> select 'Slavik' from dual;
'SLAVI ------ Slavik
SYS> select kglnahsv, kglnahsh from x$kglob where kglnaobj = 'select ''Slavik'' from dual';
KGLNAHSV KGLNAHSH
--------------------------------- ----------
7a483e90555ab4ad24e190abe3e7775d 3823597405
7a483e90555ab4ad24e190abe3e7775d 3823597405
SYS> select sql_id, hash_value, old_hash_value from v$sql where sql_text =
'select ''Slavik'' from dual';
SQL_ID HASH_VALUE OLD_HASH_VALUE
------------- ---------- --------------
29schpgjyfxux 3823597405 3501236764
So, first, let's check that our MD5 matches:
>>> import hashlib
>>> import math
>>> import struct
>>> stmt = "select 'Slavik' from dual"
>>> d = hashlib.md5(stmt + '\x00').digest()
>>> struct.unpack('IIII', d)[3]
3823597405
>>> h = ''
>>> for i in struct.unpack('IIII', d):
h += hex(i)[2:]
>>> h
'7a483e90555ab4ad24e190abe3e7775d'
Good, all seem to match!
Now, let's create some utility functions:
def sqlid_2_hash(sqlid):
sum = 0
i = 1
alphabet = '0123456789abcdfghjkmnpqrstuvwxyz'
for ch in sqlid:
sum += alphabet.index(ch) * (32**(len(sqlid) - i))
i += 1
return sum % (2 ** 32)
def stmt_2_sqlid(stmt):
h = hashlib.md5(stmt + '\x00').digest()
(d1,d2,msb,lsb) = struct.unpack('IIII', h)
sqln = msb * (2 ** 32) + lsb
stop = math.log(sqln, math.e) / math.log(32, math.e) + 1
sqlid = ''
alphabet = '0123456789abcdfghjkmnpqrstuvwxyz'
for i in range(0, stop):
sqlid = alphabet[(sqln / (32 ** i)) % 32] + sqlid
return sqlid
def stmt_2_hash(stmt):
return struct.unpack('IIII', hashlib.md5(stmt + '\x00').digest())[3]
Let's try them...
>>> stmt_2_hash(stmt)
3823597405
>>> stmt_2_sqlid(stmt)
'29schpgjyfxux'
>>> sqlid_2_hash(stmt_2_sqlid(stmt))
3823597405
Well, it all works. Now, to the real programming…
17 Responses to “ Oracle sql_id and hash value ”
Trackbacks & Pingbacks:
-
[...] 根据Slavik 的帖子Oracle sql_id and hash value可知,Oracle输入的SQL语句的hash_value与SQL_id 是根据SQL [...]
Good Tips!!
thanks for sharing, that’s a nice one..
BTW I have recently discovered that in 11.2.0.1 you can use dbms_sqltune_util0.sqltext_to_sqlid too. Ex:
select dbms_sqltune_util0.sqltext_to_sqlid(‘select sysdate from dual’||chr(0)) sql_id from dual;
well..actually I didn’t know about adding the chr(0) at the end since I read your post
Cheers,
L.
very cool!
thanks
(link your post at http://db-optimizer.blogspot.com/2010/03/calculating-sqlid-hash-from-sqltext.html )
thank you for cool information!
Unfortunatelly I don’t understand ‘python’ish. Could you translate it to PL/SQL, please?
Thanks in advance,
M.
great research work!
Is there any way for someone not on your database to compute the SQL text when given the SQL_ID?
@SLF
Not sure what you are asking. Are you asking if someone can take the SQL_ID and get the statement text from it? If so, then the answer no.
Ohh… Really good
One more..
I am working on 10g platform but it is saying ‘x$kglob’ does not exist.
What’s the reason??
Send me an email.
Hi Avinash,
Did you login as ‘sys’?
Hi Slavik,
Great job! I suggest that you take database characterset into consideration to make a more generalized program to caculate the SQL_ID. =)
Linked your cool post.
Todd
Many thanks for the example, but found a small omission; Oracle will left pad the resulting sqlid with a ’0′ to make the sqlid 13 char if it happens to be smaller than 32^13.
On 32 bit machines the algorithm can sometimes provide ‘close’ but wrong results due to rounding errors. Using bitwise operations instead of direct match solves that.
Also, for some PL/SQL code Oracle seems to do the md5 on something else than it reports in v$sql.sql_fulltext – sofar only seen that happen on pl/sql that is executed from dbms_job, but this might be coincidence. Still trying to fiddle out what happens there.
thx!
@Jan-Marten
Good catch! If you’ve created an updated code, do publish
i am not a python, but a C/C++ guy
std::string MD5::orasqlid( const unsigned char *sql, size_t len )
{
// calculate 128 bit MD5 hash on input
update( sql, len );
// oracle feeds an extra zero after the string.
unsigned char zero = 0;
update( &zero, 1 );
finalize();
std::string res = “”;
//consider only last/most significant 64 bits of regular 128bit MD5 hash
uint4 *msb = (uint4*)(&(digest[8]));
uint4 *lsb = (uint4*)(&(digest[12])); // these bits are the traditional 32 bit hash value
unsigned long long value = (unsigned long long)(*msb) * 4294967296ULL + (unsigned long long)(*lsb);
char* base32 = “0123456789abcdfghjkmnpqrstuvwxyz”;
if (value == 0)
{
res = “0″;
} else
{
while (value )
{
res = base32[value % 32] + res;
value = value / 32;
}
//left pad with zero to 13 chars
while ( res.length() < 13 )
{
res = "0" + res;
}
}
return res;
}
A previous poster requested pl/sql versions of the functions…
It’s a few months late, but hopefully these will help.
CREATE OR REPLACE FUNCTION sqltext_to_hash(p_sql IN VARCHAR2)
RETURN NUMBER
IS
– Author: Sean Stuber
– based on algorithm described by Tanel Poder here
– http://blog.tanelpoder.com/2009/02/22/sql_id-is-just-a-fancy-representation-of-hash-value/
BEGIN
— Generate md5 hash of the sql text
— Oracle appends a null-char (0) before hashing.
— pull the 4th 32bit int from the md5, reverse it to fix endianness
— convert the hex representation of the 32bits into a number and return
RETURN TO_NUMBER(
RAWTOHEX(
UTL_RAW.reverse(
UTL_RAW.SUBSTR(
DBMS_CRYPTO.hash(UTL_I18N.string_to_raw(p_sql || CHR(0), ‘AL32UTF8′),
DBMS_CRYPTO.hash_md5
),
13,
4))),
‘xxxxxxxx’);
END;
CREATE OR REPLACE FUNCTION sqltext_to_sqlid(p_sql IN VARCHAR2)
RETURN VARCHAR2
IS
— Author: Sean Stuber
— based on algorithm described by Tanel Poder here
— http://blog.tanelpoder.com/2009/02/22/sql_id-is-just-a-fancy-representation-of-hash-value/
— Generate md5 hash of the sql text
— Oracle appends a null-char (0) before hashing
v_md5 RAW(16)
:= DBMS_CRYPTO.hash(UTL_I18N.string_to_raw(p_sql || CHR(0), ‘AL32UTF8′),
DBMS_CRYPTO.hash_md5
);
— pull the last 64bits from the md5
— reverse each of the 32bit words to fix endianness
— convert the 16 digit hex string to a number
v_temp NUMBER
:= TO_NUMBER(
RAWTOHEX(UTL_RAW.reverse(UTL_RAW.SUBSTR(v_md5, 9, 4)))
|| RAWTOHEX(UTL_RAW.reverse(UTL_RAW.SUBSTR(v_md5, 13, 4))),
RPAD(‘x’, 16, ‘x’));
v_sqlid VARCHAR2(13);
BEGIN
— Convert number to base32 string
— use alphabet defined in Tanel Poder’s sqlid to hash query
FOR i IN 0 .. FLOOR(LN(v_temp) / LN(32))
LOOP
v_sqlid :=
SUBSTR(’0123456789abcdfghjkmnpqrstuvwxyz’,
FLOOR(MOD(v_temp / POWER(32, i), 32)) + 1,
1
)
|| v_sqlid;
END LOOP;
RETURN LPAD(v_sqlid, 13, ’0′);
END;
CREATE OR REPLACE FUNCTION sqlid_to_hash(p_sqlid IN VARCHAR2)
RETURN NUMBER
IS
— Author: Sean Stuber
— based on algorithm described by Tanel Poder here
— http://blog.tanelpoder.com/2009/02/22/sql_id-is-just-a-fancy-representation-of-hash-value/
v_sqlid VARCHAR2(20) := TRIM(p_sqlid);
v_hash NUMBER;
BEGIN
— Convert base32 string into a number
SELECT SUM(
(INSTR(’0123456789abcdfghjkmnpqrstuvwxyz’, SUBSTR(LOWER(v_sqlid), LEVEL, 1)) – 1)
* POWER(32, LENGTH(v_sqlid) – LEVEL))
INTO v_hash
FROM DUAL
CONNECT BY LEVEL <= LENGTH(v_sqlid);
— Hash value is only the low 32 bits
RETURN BITAND(v_hash, POWER(2, 32) – 1);
END;
/