Tue 30 Mar 2010
Oracle sql_id and hash value
Posted by Slavik under Oracle, technical tips
[18] Comments
Tanel published a great post a while ago talking about Oracle’s sql_id and hash values in Oracle 10g+. I wanted to be able to compute sql_id and hash values directly from SQL statements for our Hedgehog product. I did a few tests and could not match the MD5 value generated from the SQL statement to the MD5 value Oracle is calculating in X$KGLOB.KGLNAHSV. After a short discussion with Tanel, it turned out that Oracle is appending a NULL (‘\0’) value to the statement and then calculates the MD5.
Here is a test and some code in Python:
SYS> select 'Slavik' from dual;
'SLAVI  Slavik
SYS> select kglnahsv, kglnahsh from x$kglob where kglnaobj = 'select ''Slavik'' from dual';
KGLNAHSV KGLNAHSH   7a483e90555ab4ad24e190abe3e7775d 3823597405 7a483e90555ab4ad24e190abe3e7775d 3823597405 SYS> select sql_id, hash_value, old_hash_value from v$sql where sql_text = 'select ''Slavik'' from dual'; SQL_ID HASH_VALUE OLD_HASH_VALUE    29schpgjyfxux 3823597405 3501236764 So, first, let's check that our MD5 matches: >>> import hashlib >>> import math >>> import struct >>> stmt = "select 'Slavik' from dual" >>> d = hashlib.md5(stmt + '\x00').digest() >>> struct.unpack('IIII', d)[3] 3823597405 >>> h = '' >>> for i in struct.unpack('IIII', d): h += hex(i)[2:] >>> h '7a483e90555ab4ad24e190abe3e7775d' Good, all seem to match! Now, let's create some utility functions: def sqlid_2_hash(sqlid): sum = 0 i = 1 alphabet = '0123456789abcdfghjkmnpqrstuvwxyz' for ch in sqlid: sum += alphabet.index(ch) * (32**(len(sqlid)  i)) i += 1 return sum % (2 ** 32) def stmt_2_sqlid(stmt): h = hashlib.md5(stmt + '\x00').digest() (d1,d2,msb,lsb) = struct.unpack('IIII', h) sqln = msb * (2 ** 32) + lsb stop = math.log(sqln, math.e) / math.log(32, math.e) + 1 sqlid = '' alphabet = '0123456789abcdfghjkmnpqrstuvwxyz' for i in range(0, stop): sqlid = alphabet[(sqln / (32 ** i)) % 32] + sqlid return sqlid def stmt_2_hash(stmt): return struct.unpack('IIII', hashlib.md5(stmt + '\x00').digest())[3] Let's try them... >>> stmt_2_hash(stmt) 3823597405 >>> stmt_2_sqlid(stmt) '29schpgjyfxux' >>> sqlid_2_hash(stmt_2_sqlid(stmt)) 3823597405
Well, it all works. Now, to the real programming…
18 Responses to “ Oracle sql_id and hash value ”
Trackbacks & Pingbacks:

[…] 根据Slavik 的帖子Oracle sql_id and hash value可知,Oracle输入的SQL语句的hash_value与SQL_id 是根据SQL […]
Good Tips!!
thanks for sharing, that’s a nice one..
BTW I have recently discovered that in 11.2.0.1 you can use dbms_sqltune_util0.sqltext_to_sqlid too. Ex:
select dbms_sqltune_util0.sqltext_to_sqlid(‘select sysdate from dual’chr(0)) sql_id from dual;
well..actually I didn’t know about adding the chr(0) at the end since I read your post 🙂
Cheers,
L.
very cool!
thanks
(link your post at http://dboptimizer.blogspot.com/2010/03/calculatingsqlidhashfromsqltext.html )
thank you for cool information!
Unfortunatelly I don’t understand ‘python’ish. Could you translate it to PL/SQL, please?
Thanks in advance,
M.
great research work!
Is there any way for someone not on your database to compute the SQL text when given the SQL_ID?
@SLF
Not sure what you are asking. Are you asking if someone can take the SQL_ID and get the statement text from it? If so, then the answer no.
Ohh… Really good
One more..
I am working on 10g platform but it is saying ‘x$kglob’ does not exist.
What’s the reason??
Send me an email.
Hi Avinash,
Did you login as ‘sys’?
Hi Slavik,
Great job! I suggest that you take database characterset into consideration to make a more generalized program to caculate the SQL_ID. =)
Linked your cool post.
Todd
Many thanks for the example, but found a small omission; Oracle will left pad the resulting sqlid with a ‘0’ to make the sqlid 13 char if it happens to be smaller than 32^13.
On 32 bit machines the algorithm can sometimes provide ‘close’ but wrong results due to rounding errors. Using bitwise operations instead of direct match solves that.
Also, for some PL/SQL code Oracle seems to do the md5 on something else than it reports in v$sql.sql_fulltext – sofar only seen that happen on pl/sql that is executed from dbms_job, but this might be coincidence. Still trying to fiddle out what happens there.
thx!
@JanMarten
Good catch! If you’ve created an updated code, do publish 🙂
i am not a python, but a C/C++ guy 🙂
std::string MD5::orasqlid( const unsigned char *sql, size_t len )
{
// calculate 128 bit MD5 hash on input
update( sql, len );
// oracle feeds an extra zero after the string.
unsigned char zero = 0;
update( &zero, 1 );
finalize();
std::string res = “”;
//consider only last/most significant 64 bits of regular 128bit MD5 hash
uint4 *msb = (uint4*)(&(digest[8]));
uint4 *lsb = (uint4*)(&(digest[12])); // these bits are the traditional 32 bit hash value
unsigned long long value = (unsigned long long)(*msb) * 4294967296ULL + (unsigned long long)(*lsb);
char* base32 = “0123456789abcdfghjkmnpqrstuvwxyz”;
if (value == 0)
{
res = “0”;
} else
{
while (value )
{
res = base32[value % 32] + res;
value = value / 32;
}
//left pad with zero to 13 chars
while ( res.length() < 13 )
{
res = "0" + res;
}
}
return res;
}
A previous poster requested pl/sql versions of the functions…
It’s a few months late, but hopefully these will help.
CREATE OR REPLACE FUNCTION sqltext_to_hash(p_sql IN VARCHAR2)
RETURN NUMBER
IS
— Author: Sean Stuber
— based on algorithm described by Tanel Poder here
— http://blog.tanelpoder.com/2009/02/22/sql_idisjustafancyrepresentationofhashvalue/
BEGIN
— Generate md5 hash of the sql text
— Oracle appends a nullchar (0) before hashing.
— pull the 4th 32bit int from the md5, reverse it to fix endianness
— convert the hex representation of the 32bits into a number and return
RETURN TO_NUMBER(
RAWTOHEX(
UTL_RAW.reverse(
UTL_RAW.SUBSTR(
DBMS_CRYPTO.hash(UTL_I18N.string_to_raw(p_sql  CHR(0), ‘AL32UTF8’),
DBMS_CRYPTO.hash_md5
),
13,
4))),
‘xxxxxxxx’);
END;
CREATE OR REPLACE FUNCTION sqltext_to_sqlid(p_sql IN VARCHAR2)
RETURN VARCHAR2
IS
— Author: Sean Stuber
— based on algorithm described by Tanel Poder here
— http://blog.tanelpoder.com/2009/02/22/sql_idisjustafancyrepresentationofhashvalue/
— Generate md5 hash of the sql text
— Oracle appends a nullchar (0) before hashing
v_md5 RAW(16)
:= DBMS_CRYPTO.hash(UTL_I18N.string_to_raw(p_sql  CHR(0), ‘AL32UTF8’),
DBMS_CRYPTO.hash_md5
);
— pull the last 64bits from the md5
— reverse each of the 32bit words to fix endianness
— convert the 16 digit hex string to a number
v_temp NUMBER
:= TO_NUMBER(
RAWTOHEX(UTL_RAW.reverse(UTL_RAW.SUBSTR(v_md5, 9, 4)))
 RAWTOHEX(UTL_RAW.reverse(UTL_RAW.SUBSTR(v_md5, 13, 4))),
RPAD(‘x’, 16, ‘x’));
v_sqlid VARCHAR2(13);
BEGIN
— Convert number to base32 string
— use alphabet defined in Tanel Poder’s sqlid to hash query
FOR i IN 0 .. FLOOR(LN(v_temp) / LN(32))
LOOP
v_sqlid :=
SUBSTR(‘0123456789abcdfghjkmnpqrstuvwxyz’,
FLOOR(MOD(v_temp / POWER(32, i), 32)) + 1,
1
)
 v_sqlid;
END LOOP;
RETURN LPAD(v_sqlid, 13, ‘0’);
END;
CREATE OR REPLACE FUNCTION sqlid_to_hash(p_sqlid IN VARCHAR2)
RETURN NUMBER
IS
— Author: Sean Stuber
— based on algorithm described by Tanel Poder here
— http://blog.tanelpoder.com/2009/02/22/sql_idisjustafancyrepresentationofhashvalue/
v_sqlid VARCHAR2(20) := TRIM(p_sqlid);
v_hash NUMBER;
BEGIN
— Convert base32 string into a number
SELECT SUM(
(INSTR(‘0123456789abcdfghjkmnpqrstuvwxyz’, SUBSTR(LOWER(v_sqlid), LEVEL, 1)) – 1)
* POWER(32, LENGTH(v_sqlid) – LEVEL))
INTO v_hash
FROM DUAL
CONNECT BY LEVEL <= LENGTH(v_sqlid);
— Hash value is only the low 32 bits
RETURN BITAND(v_hash, POWER(2, 32) – 1);
END;
/
Java:
private static String stmt_2_sqlid(String stmt) throws Exception
{
byte[] bytesOfMessage = (stmt).getBytes(“utf8”);
MessageDigest md = MessageDigest.getInstance(“MD5”);
byte[] b = md.digest(bytesOfMessage);
long val_msb = 0; // top 3 bytes //
long val_lsb = 0; // bottom 5 bytes //
// convert to long values //
val_msb = ((b[11] & 0xff) * 0x100 + (b[10] & 0xff)) * 0x100 + (b[9] & 0xff);
val_lsb = ((((b[8] & 0xff) * 0x100 + (b[15] & 0xff)) * 0x100 + (b[14] & 0xff)) * 0x100 + (b[13] & 0xff)) * 0x100 + (b[12] & 0xff);
// convert from log values to characters //
String alphabet = “0123456789abcdfghjkmnpqrstuvwxyz”;
String result = “”;
for (int i = 1; i <= 8; i++)
{
result = alphabet.substring((int) (val_lsb % 32), (int) (val_lsb % 32) + 1) + result;
val_lsb = val_lsb / 32;
}
for (int i = 9; i <= 13; i++)
{
result = alphabet.substring((int) (val_msb % 32), (int) (val_msb % 32) + 1) + result;
val_msb = val_msb / 32;
}
return result;
}