Microsoft Web N-Gram REST interface

In addition to SOAP, you can now access the Microsoft Web N-Gram data using simple GET/POST requests. The base URI has the following format:

http://web-ngram.research.microsoft.com/rest/lookup.svc/

A GET request to this URI returns the list of supported models in path form ({catalog}/{version}/{order}). Each entry can be used in the lookup methods, which take the following form:

http://web-ngram.research.microsoft.com/rest/lookup.svc/{catalog}/{version}/{order}/{operation}?{parameters}

For the newest version of the service (where new models will be made available), use the following format:

http://weblm.research.microsoft.com/weblm/rest.svc/{catalog}/{version}/{order}/{operation}?{parameters}
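The URI templates above can be filled in programmatically. The following is a minimal Python 3 sketch, not part of the service itself: the helper name build_lookup_url and the constant names are illustrative assumptions, and only the two base URIs and the template layout come from the text above.

```python
from urllib.parse import urlencode

# Base URIs quoted from the text above; the constant names are illustrative.
LEGACY_BASE = "http://web-ngram.research.microsoft.com/rest/lookup.svc/"
NEWEST_BASE = "http://weblm.research.microsoft.com/weblm/rest.svc/"


def build_lookup_url(base, catalog, version, order, operation, **params):
    """Fill the {catalog}/{version}/{order}/{operation}?{parameters} template."""
    path = "/".join([catalog, version, str(order), operation])
    # urlencode percent-encodes the values and turns spaces into '+'.
    query = urlencode(params)
    url = base.rstrip("/") + "/" + path
    return url + "?" + query if query else url


url = build_lookup_url(
    LEGACY_BASE, "bing-body", "jun09", 3, "cp",
    u="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",  # placeholder user token
    p="one two three",
    format="json",
)
print(url)
```

Passing NEWEST_BASE instead of LEGACY_BASE targets the newer endpoint with the same path layout.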

Operation  Verb (and SOAP equivalent)           Parameters        Required?
---------  -----------------------------------  ----------------  ----------------------
jp         GET  = GetProbability                u=<user_token>    Yes
           POST = GetProbabilities              p=<phrase>        GET only
                                                format=<format>   No

cp         GET  = GetConditionalProbability     u=<user_token>    Yes
           POST = GetConditionalProbabilities   p=<phrase>        GET only
                                                format=<format>   No

gen        GET  = Generate                      u=<user_token>    Yes
                                                p=<phrase>        Yes
                                                n=<max tokens>    Yes
                                                cookie=<cookie>   Yes, except first call
                                                format=<format>   No
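The parameter rules in the table above can be checked on the client before a request is sent. The sketch below is one possible reading of the table, not an official validator: the names RULES and check_params are hypothetical, and it assumes that "GET only" means p is passed in the query string for GET and omitted for POST.

```python
# Parameter rules transcribed from the table above.
# operation -> verb -> (required parameters, optional parameters)
RULES = {
    "jp": {"GET": ({"u", "p"}, {"format"}), "POST": ({"u"}, {"format"})},
    "cp": {"GET": ({"u", "p"}, {"format"}), "POST": ({"u"}, {"format"})},
    "gen": {"GET": ({"u", "p", "n"}, {"format", "cookie"})},
}


def check_params(operation, verb, params, first_call=True):
    """Return a list of problems; an empty list means the call fits the table."""
    verbs = RULES.get(operation)
    if verbs is None:
        return ["unknown operation: " + operation]
    if verb not in verbs:
        return [operation + " does not support " + verb]
    required = set(verbs[verb][0])
    optional = set(verbs[verb][1])
    # For gen, the cookie becomes mandatory after the first call.
    if operation == "gen" and not first_call:
        required.add("cookie")
        optional.discard("cookie")
    given = set(params)
    problems = ["missing parameter: " + name for name in sorted(required - given)]
    problems += ["unexpected parameter: " + name
                 for name in sorted(given - required - optional)]
    return problems


print(check_params("gen", "GET", {"u": "token", "p": "one two", "n": 5},
                   first_call=False))
```

The last call reports the missing cookie, since it is marked as a follow-up call rather than the first one.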

In each case, the format can be one of the following: text, json, or xml. When no format is specified, text is assumed.
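As an illustration of the three formats, the following Python 3 sketch turns a jp or cp response body into a list of floats. The function name parse_response is an assumption, and the sample bodies are the ones shown in the examples on this page; the sketch does not cover gen responses or the model list.

```python
import json
import xml.etree.ElementTree as ET


def parse_response(body, fmt="text"):
    """Return the probabilities in a jp/cp response body as a list of floats."""
    if fmt == "text":
        # Text responses carry one value per line.
        return [float(line) for line in body.splitlines() if line.strip()]
    if fmt == "json":
        value = json.loads(body)
        # GET returns a bare number, POST returns an array.
        return [float(v) for v in value] if isinstance(value, list) else [float(value)]
    if fmt == "xml":
        root = ET.fromstring(body)
        # Match <float> elements regardless of their XML namespace.
        return [float(el.text) for el in root.iter()
                if el.tag.rsplit("}", 1)[-1] == "float"]
    raise ValueError("format must be text, json, or xml")


xml_body = ('<ArrayOffloat xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays" '
            'xmlns:i="http://www.w3.org/2001/XMLSchema-instance">'
            '<float>-7.433633</float><float>-8.232555</float></ArrayOffloat>')
print(parse_response("-7.433633\n-8.232555"))
print(parse_response("[-7.433633,-8.232555]", "json"))
print(parse_response(xml_body, "xml"))
```

All three calls yield the same two values, so callers can pick whichever format is most convenient to request.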

When using the batch-mode methods (i.e., with a POST request), send the phrases in the request body, separated by newline characters.
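A batch request body can be assembled as in the Python 3 sketch below. The helper name build_batch_request is hypothetical, and UTF-8 encoding of the body is an assumption, since the page does not state an encoding. The request is only constructed here, not sent.

```python
from urllib.request import Request


def build_batch_request(url, phrases):
    """Build a POST request whose body holds one phrase per line."""
    for phrase in phrases:
        # A newline inside a phrase would be read as a phrase separator.
        if "\n" in phrase:
            raise ValueError("phrases must not contain newline characters")
    body = "\n".join(phrases).encode("utf-8")  # UTF-8 is an assumption
    return Request(url, data=body, method="POST")


req = build_batch_request(
    "http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/jp"
    "?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",  # placeholder user token
    ["phrase one", "phrase two"],
)
print(req.get_method())
print(req.data)
```

Passing req to urllib.request.urlopen would submit it; the response should then contain one value per submitted phrase, in the same order as in the examples on this page.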

Python examples (Python 2 syntax):

In these and other examples, you must provide your own value for the u parameter.

>>> import urllib
>>> import urllib2
>>> print urllib2.urlopen(urllib2.Request('http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/jp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx','phrase one\nphrase two')).read()
-7.433633
-8.232555
>>> print urllib2.urlopen(urllib2.Request('http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/jp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&format=json','phrase one\nphrase two')).read()
[-7.433633,-8.232555]
>>> print urllib2.urlopen(urllib2.Request('http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/jp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&format=xml','phrase one\nphrase two')).read()
<ArrayOffloat xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><float>-7.433633</float><float>-8.232555</float></ArrayOffloat>

cURL examples:

Note that & characters must be escaped with a caret (^) in a Microsoft Windows command prompt. Other shells have their own quirks.

C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/
bing-anchor/jun09/1
bing-anchor/jun09/2
bing-anchor/jun09/3
bing-anchor/jun09/4
bing-body/jun09/1
bing-body/jun09/2
bing-body/jun09/3
bing-title/jun09/1
bing-title/jun09/2
bing-title/jun09/3
bing-title/jun09/4
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/?format=json
["bing-anchor\/jun09\/1","bing-anchor\/jun09\/2","bing-anchor\/jun09\/3","bing-anchor\/jun09\/4","bing-body\/jun09\/1","bing-body\/jun09\/2","bing-body\/jun09\/3","bing-title\/jun09\/1","bing-title\/jun09\/2","bing-title\/jun09\/3","bing-title\/jun09\/4"]
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx^&p=one+two+three^&format=text
-0.4996111
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx^&p=one+two+three^&format=json
-0.49961108
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx^&p=one+two+three^&format=xml
<float xmlns="http://schemas.microsoft.com/2003/10/Serialization/">-0.49961108</float>
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -d "post example"
-4.182756
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx^&format=json -d "post example"
[-4.182756]
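The model list returned by the base URI (the first two cURL calls above) can be split into its catalog, version, and order parts. This is a minimal Python 3 sketch; the function name parse_model_list is an assumption, and the sample bodies are shortened versions of the outputs shown above.

```python
import json


def parse_model_list(body, fmt="text"):
    """Split each catalog/version/order entry into a (str, str, int) tuple."""
    # json.loads decodes the escaped "\/" sequences back to plain slashes.
    entries = json.loads(body) if fmt == "json" else body.split()
    models = []
    for entry in entries:
        catalog, version, order = entry.split("/")
        models.append((catalog, version, int(order)))
    return models


text_body = "bing-anchor/jun09/1\nbing-body/jun09/3\n"
json_body = r'["bing-anchor\/jun09\/1","bing-body\/jun09\/3"]'
print(parse_model_list(text_body))
print(parse_model_list(json_body, "json"))
```

Each tuple maps directly onto the {catalog}/{version}/{order} portion of a lookup URI.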

Last update: July 07, 2010