In addition to SOAP, you can now access the Microsoft Web N-Gram data using simple GET/POST requests. The base URI has the following format:
http://web-ngram.research.microsoft.com/rest/lookup.svc/
A GET call on this URI returns a list of the supported models in path form (for example, bing-body/jun09/3). Each model path can be used in the lookup methods, which have the following format:
http://web-ngram.research.microsoft.com/rest/lookup.svc/{catalog}/{version}/{order}/{operation}?{parameters}
For the newest version of the service (where new models will be made available), use the following format:
http://weblm.research.microsoft.com/weblm/rest.svc/{catalog}/{version}/{order}/{operation}?{parameters}
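The URI template above can be assembled programmatically. The following is a minimal Python 3 sketch, not part of the service itself: the helper name build_lookup_url and its arguments are hypothetical, and it defaults to the older base URI because that is the one used in the examples on this page.

```python
from urllib.parse import urlencode

# Older base URI from this page; pass base=... to target the newer service.
BASE_URI = "http://web-ngram.research.microsoft.com/rest/lookup.svc/"


def build_lookup_url(model, operation, token, phrase=None, fmt=None, base=BASE_URI):
    """Return a lookup URL for a model path such as 'bing-body/jun09/3'.

    Hypothetical helper: it only fills in the
    {catalog}/{version}/{order}/{operation}?{parameters} template.
    """
    params = {"u": token}
    if phrase is not None:
        params["p"] = phrase      # used for single-phrase GET lookups
    if fmt is not None:
        params["format"] = fmt    # text, json, or xml
    return "{}{}/{}?{}".format(base, model.strip("/"), operation, urlencode(params))


url = build_lookup_url(
    "bing-body/jun09/3", "cp",
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",   # placeholder token, supply your own
    phrase="one two three", fmt="json",
)
print(url)
```

urlencode takes care of escaping, so spaces in the phrase become + characters, matching the form used in the cURL examples.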
| Operation | Verb (and SOAP equivalent) | Parameters | Required? |
| jp | GET = GetProbability, POST = GetProbabilities | u=<user_token> | Yes |
| | | p=<phrase> | GET only |
| | | format=<format> | No |
| cp | GET = GetConditionalProbability, POST = GetConditionalProbabilities | u=<user_token> | Yes |
| | | p=<phrase> | GET only |
| | | format=<format> | No |
| gen | GET = Generate | u=<user_token> | Yes |
| | | p=<phrase> | Yes |
| | | n=<max tokens> | Yes |
| | | cookie=<cookie> | Yes, except first call |
| | | format=<format> | No |
In each case, the format can be one of the following: text, json, or xml. When no format is specified, text is assumed.
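For the gen operation, the table says the cookie parameter is required on every call except the first. A minimal Python 3 sketch of that rule follows. The helper name gen_query and the cookie value "abc" are hypothetical, and how the service returns the cookie is not described on this page, so the sketch only builds the query string.

```python
from urllib.parse import urlencode


def gen_query(token, phrase, max_tokens, cookie=None, fmt=None):
    """Build the query string for one call to the gen operation."""
    params = {"u": token, "p": phrase, "n": max_tokens}
    if cookie is not None:
        params["cookie"] = cookie   # required on every call after the first
    if fmt is not None:
        params["format"] = fmt      # optional: text, json, or xml
    return urlencode(params)


token = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"          # placeholder token
first = gen_query(token, "one two", 5)                 # first call: no cookie
later = gen_query(token, "one two", 5, cookie="abc")   # "abc" is a made-up cookie
print(first)
print(later)
```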
When using the batch-mode methods (i.e. with a POST request), each phrase should be separated by a newline character.
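A batch request body can be prepared as follows. This is a minimal Python 3 sketch: the helper name build_batch_request is hypothetical, and UTF-8 encoding of the body is an assumption, since the page does not state an encoding.

```python
from urllib.request import Request


def build_batch_request(url, phrases):
    """Build a POST request whose body holds one phrase per line."""
    for phrase in phrases:
        if "\n" in phrase:
            # A newline inside a phrase would be read as a phrase separator.
            raise ValueError("phrases must not contain newline characters")
    body = "\n".join(phrases).encode("utf-8")   # encoding is an assumption
    return Request(url, data=body)              # attaching data makes it a POST


req = build_batch_request(
    "http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/jp"
    "?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&format=json",
    ["phrase one", "phrase two"],
)
print(req.get_method())
print(req.data)
```

The request is only constructed here; passing req to urllib.request.urlopen would send it.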
Python examples (Python 2):
In these and other examples, you must provide your own value for the u parameter.
>>> import urllib
>>> import urllib2
>>> print urllib2.urlopen(urllib2.Request('http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/jp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx','phrase one\nphrase two')).read()
-7.433633
-8.232555
>>> print urllib2.urlopen(urllib2.Request('http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/jp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&format=json','phrase one\nphrase two')).read()
[-7.433633,-8.232555]
>>> print urllib2.urlopen(urllib2.Request('http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/jp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&format=xml','phrase one\nphrase two')).read()
<ArrayOffloat xmlns="http://schemas.microsoft.com/2003/10/Serialization/Arrays" xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><float>-7.433633</float><float>-8.232555</float></ArrayOffloat>
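The three response formats shown above can be turned into Python floats with a small amount of parsing. The following Python 3 sketch assumes the responses look exactly like the samples on this page; the function name parse_probabilities is hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def parse_probabilities(payload, fmt="text"):
    """Turn a response body into a list of floats."""
    if fmt == "json":
        value = json.loads(payload)
        # A single-phrase GET returns a bare number, a batch POST a list.
        values = value if isinstance(value, list) else [value]
        return [float(v) for v in values]
    if fmt == "xml":
        root = ET.fromstring(payload)
        # Elements are namespaced, so match on the local name only.
        return [float(el.text) for el in root.iter() if el.tag.endswith("}float")]
    # Plain text: one value per line.
    return [float(tok) for tok in payload.split()]


print(parse_probabilities("-7.433633\n-8.232555"))
print(parse_probabilities("[-7.433633,-8.232555]", "json"))
```

Matching on the local element name lets the same function handle both the ArrayOffloat wrapper returned by batch calls and the single float element returned by a GET.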
cURL examples:
Note that & characters must be escaped with a caret (^) in a Microsoft Windows command prompt. In POSIX shells such as bash, enclose the whole URL in quotes instead.
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/
bing-anchor/jun09/1
bing-anchor/jun09/2
bing-anchor/jun09/3
bing-anchor/jun09/4
bing-body/jun09/1
bing-body/jun09/2
bing-body/jun09/3
bing-title/jun09/1
bing-title/jun09/2
bing-title/jun09/3
bing-title/jun09/4
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/?format=json
["bing-anchor\/jun09\/1","bing-anchor\/jun09\/2","bing-anchor\/jun09\/3","bing-anchor\/jun09\/4","bing-body\/jun09\/1","bing-body\/jun09\/2","bing-body\/jun09\/3","bing-title\/jun09\/1","bing-title\/jun09\/2","bing-title\/jun09\/3","bing-title\/jun09\/4"]
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx^&p=one+two+three^&format=text
-0.4996111
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx^&p=one+two+three^&format=json
-0.49961108
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx^&p=one+two+three^&format=xml
<float xmlns="http://schemas.microsoft.com/2003/10/Serialization/">-0.49961108</float>
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx -d "post example"
-4.182756
C:\bin\curl>curl http://web-ngram.research.microsoft.com/rest/lookup.svc/bing-body/jun09/3/cp?u=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx^&format=json -d "post example"
[-4.182756]
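The JSON model list returned by a GET on the base URI (see the second cURL example) can be split back into its catalog, version, and order parts. A minimal Python 3 sketch, assuming every path has exactly three segments as in the output above; the function name parse_models is hypothetical and the sample is a shortened copy of that output.

```python
import json


def parse_models(payload):
    """Split JSON model paths into (catalog, version, order) tuples."""
    models = []
    for path in json.loads(payload):     # "\/" in the JSON decodes to "/"
        catalog, version, order = path.split("/")
        models.append((catalog, version, int(order)))
    return models


# Raw string so the backslashes reach the JSON parser unchanged.
sample = r'["bing-anchor\/jun09\/1","bing-body\/jun09\/3","bing-title\/jun09\/4"]'
for catalog, version, order in parse_models(sample):
    print(catalog, version, order)
```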
Last update: July 07, 2010
