Using the Watson API from Node

The IBM Watson API is part of IBM Bluemix, which appears to be a fairly similar offering to AWS.

The “Watson” part of the offering has several analysis APIs, including image recognition, text analysis, speech to text, translation, etc.

In this example, I’ll show how to access the “Alchemy” API from Node. Alchemy gives you a wide range of basic natural language processing – in this case I’ll test out the sentiment analysis API (https://gateway-a.watsonplatform.net/calls/text/TextGetTextSentiment). If you’re looking to use this, it gives you 1,000 free API calls per day and is metered after that. Pricing per API call varies depending on which call it is and how difficult or commoditized the operation is. The APIs still in beta also tend to be free.

When you set up an account, you can activate one of these APIs on your account through the portal.

Note that if you use the free version they require you to include sourcing in your application:

Usage Restrictions

Usage of AlchemyAPI should abide by the restrictions of your respective API tier. Approved academic users are permitted to make a maximum of 30,000 API calls each day (or 1,000 daily calls for commercial use). More daily API calls (up to 200,000,000 daily) are available through an AlchemyAPI Subscription Plan.

Users of the AlchemyAPI Free service tier must provide proper attribution within their website or application:

Provide a clickable hyperlink to www.alchemyapi.com with the text "Text Analysis by AlchemyAPI" within your website or application; and

Provide attribution to AlchemyAPI within any published works that are based on or mention AlchemyAPI, or content generated by AlchemyAPI, including but not limited to research papers and journal articles.

Provide attribution to AlchemyAPI within all web pages or documents where AlchemyAPI content and/or API results are used or displayed.

Other usage restrictions are listed in the AlchemyAPI Terms of Use policy.

When you call this API, you just need an API key, and then a payload of request parameters:

let apiCall = {
  apikey: apikey,
  text: text,
  outputMode: 'json',
  showSourceMode: 0
};

const { post } = require('request');   // HTTP client; callbacks are (err, response, body)
const { writeFile } = require('fs');

post('https://gateway-a.watsonplatform.net' + baseUrl + endpoint,
  {form: apiCall},
  (err, response, body) => {
    if (!!err) {
      console.log(err);
      return;
    }

    writeFile(
      filename,
      body,
      (writeErr) => { if (writeErr) console.log(writeErr); }
    );
  }
);

I like to write each API call’s output to its own JSON file – this way I don’t risk making the same API call multiple times and wasting quota. Around the above code I add a check to make sure the API output isn’t already downloaded:

let filename: string =
  "json/watson/" + field + "_" + endpoint + "_" + id + ".json";

if (existsSync(filename)) {
  console.log("Skipping " + filename);
} else {
  console.log("Analyzing " + filename);

  ...
}

When I do this, I’m starting from a folder containing a series of JSON files I want to analyze. To process them, you first read the folder:

readdir("json/1",
  (err, files) => {
    if (!!err) {
      console.log(err);
      return;
    }
    process(files, 0, () => {console.log("finished!")});
  });

The “process” function recurses on itself rather than looping, so only one file is open at a time – otherwise you get a “too many open files” error.

function process(files, i, finish) {
  let filename = files[i];
  let id = (filename.split("."))[0];

  readFile(
    "json/1/" + filename,
    (err, content) => {
      Try( () => {
        if (!!err) {
          console.log(err);
          return;
        }

        let json: any = JSON.parse(content + '');

        fields.map(
          (field) =>
            apiCalls.map(
              (endpoint) =>
                analyze(field, endpoint, id, json[field])));

        // Recurse once per file - recursing inside the maps above
        // would kick off the next file once per field/endpoint pair.
        if (i + 1 < files.length) {
          process(files, i + 1, finish);
        } else {
          finish();
        }
      }, "Parsing file " + id);
    });
}
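The Try helper used above isn’t shown in this post – judging from how it’s called (a thunk plus a label), a minimal version might just catch and log, so one bad file doesn’t crash the whole run. This is a guess at its shape, not the original implementation:

```typescript
// Run the thunk; if it throws, log the label and the error
// instead of letting the exception kill the batch.
function Try(thunk: () => void, label: string): void {
  try {
    thunk();
  } catch (e) {
    console.log(label + " failed: " + e);
  }
}
```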

When this runs successfully, you get output like the following example – this one uses the transcript from a university lecture. I did find it a little disappointing that this call only returns a single data value, where the Watson demo gives you hundreds – it appears you pay for every data point you get out.

{
    "status": "OK",
    "usage": "By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html",
    "totalTransactions": "1",
    "language": "english",
    "docSentiment": {
        "mixed": "1",
        "score": "0.214952",
        "type": "positive"
    }
}
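Note that Watson returns the numeric fields as strings, so pulling a usable score out of this response takes a small parsing step. A minimal sketch – the guard for a missing score field is an assumption on my part, not something the response above confirms:

```typescript
interface DocSentiment {
  mixed?: string;
  score?: string;
  type: string;
}

// Extract the sentiment score from a raw response body,
// converting Watson's string-encoded number to a float.
function sentimentScore(body: string): number {
  const parsed = JSON.parse(body);
  const doc: DocSentiment = parsed.docSentiment;
  // Assumption: treat a missing score (e.g. a neutral result) as 0.
  return doc.score ? parseFloat(doc.score) : 0;
}
```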

If you run this too many times, you’ll get an API response that looks like this:

{
    "status": "ERROR",
    "statusInfo": "daily-transaction-limit-exceeded"
}

This is a bit of a pain, because you can hit it quickly (the sentiment API is quite fast, so it takes less than a minute to burn through the free calls).
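Since every file processed after the limit is hit just produces more of these errors, it’s worth detecting this response and stopping the recursion early. A minimal check – `isRateLimited` is a hypothetical helper name, assuming the raw response body string is in hand:

```typescript
// Returns true when the API reports the daily quota is spent,
// so the caller can call finish() instead of recursing further.
function isRateLimited(body: string): boolean {
  try {
    const parsed = JSON.parse(body);
    return parsed.status === "ERROR" &&
      parsed.statusInfo === "daily-transaction-limit-exceeded";
  } catch (e) {
    return false; // unparseable body: let the normal error path handle it
  }
}
```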

When running this, I found it helpful to inject a logging utility that prints outgoing HTTP calls – this helps when you’ve misconfigured something in the Node library that makes the external requests:

requestLogger(require('https'));
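The requestLogger function here is a stand-in for whatever logging shim you prefer. A minimal sketch of one, assuming all it needs to do is monkey-patch the module’s request() and print each outgoing call before delegating:

```typescript
// Wrap httpModule.request so every outgoing call is printed.
// Works on any module exposing a request(options, ...) function.
function requestLogger(httpModule: any): void {
  const original = httpModule.request;
  httpModule.request = function (options: any, ...rest: any[]) {
    console.log("HTTP request:", options.method || "GET",
      (options.host || options.hostname || "") + (options.path || ""));
    return original.call(this, options, ...rest);
  };
}
```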