Step 4 - JSON

What is the JSON file format?

JSON stands for JavaScript Object Notation and is pronounced “jason”. Its structure is very similar to Python dictionaries and lists, with only a few exceptions. It is a very common format for computer programs to exchange information, particularly through Application Programming Interfaces (APIs).

When we convert Python objects to JSON, each Python type is mapped to a JSON type. The mapping is shown below:

Python          JSON
dict            object
list, tuple     array
str             string
int, float      number
True            true
False           false
None            null

For this reason, we need to use the json package to perform the conversion. The json package is part of the Python standard library, so there is no need to install it with pip.
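
As a minimal sketch of the mapping above, the snippet below converts a small Python dictionary to a JSON string and back again. The dictionary sample and its keys are made up for illustration only.

```python
import json

# Hypothetical sample data, chosen to touch each row of the mapping table
sample = {
    "name": "lab",        # str   -> string
    "steps": (1, 2, 3),   # tuple -> array
    "ratio": 0.5,         # float -> number
    "done": True,         # True  -> true
    "skipped": False,     # False -> false
    "notes": None,        # None  -> null
}

as_json = json.dumps(sample)
print(as_json)

# Converting back gives Python types again; the tuple comes back as a list
round_trip = json.loads(as_json)
print(round_trip["steps"])
```

Note that the mapping is not fully symmetrical: JSON has only one array type, so a tuple that is converted to JSON is read back as a list.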

json.loads() & json.dumps()

Before we use an external file it is worth spending time learning about json.loads() and json.dumps(). These two functions work with JSON strings, which is what the trailing 's' denotes. When learning to manipulate JSON it is easy to confuse json.loads() with json.load(), or json.dumps() with json.dump().

Here is the easy way to know which to use.

  • json.load() & json.dump() - Use to read JSON from files and write JSON to files.
  • json.loads() & json.dumps() - Use to read JSON from strings and write JSON to strings.
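
The difference between the two pairs can be sketched as follows. This is only an illustration: the dictionary data is invented, and io.StringIO is used as an in-memory stand-in for a real file so that nothing is written to disk.

```python
import io
import json

# Hypothetical data for illustration
data = {"Text": "hello", "Required": True}

# Strings: dumps() returns a str, loads() reads from a str
text = json.dumps(data)
from_string = json.loads(text)

# Files: dump() writes to a file object, load() reads from a file object.
# io.StringIO behaves like a text file opened with open().
buffer = io.StringIO()
json.dump(data, buffer)
buffer.seek(0)  # go back to the start before reading
from_file = json.load(buffer)

print(from_string == from_file)
```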

Create a new file lab_5_step_4_json_input.py and add the text below:

import json

# This uses a json string as an input 

json_string = """
{
    "Input":[
        {
        "Text":"I am learning to code in AWS",
        "SourceLanguageCode":"en",
        "TargetLanguageCode":"fr",
        "Required": true
        }
    ]
}
"""

def main():
    json_input = json.loads(json_string)
    print(json_input)

if __name__=="__main__":
    main()

The parameters should by now look familiar as our inputs to the Amazon Translate service, with the addition of an extra parameter, "Required": true. This is just for illustration, and we will remove it in a moment.

  • To run the program, enter the following command in the terminal:

    python lab_5_step_4_json_input.py

  • It should return:

{'Input': [{'Text': 'I am learning to code in AWS', 'SourceLanguageCode': 'en', 'TargetLanguageCode': 'fr', 'Required': True}]}

If you look at the key:value pair for “Required” you can see that Python has mapped the original JSON value true to the Python value True. This illustrates how the mapping between JSON and Python works.

This format is not easy to read, but we can get Python to indent the output in the same way as the original string that we passed in as the variable json_string. To do this we are going to use json.dumps() with an added parameter, indent, to format the string with indentation.

Modify your code as below to use json.dumps(json_input, indent=2)

import json

# This uses a json string as an input 

json_string = """
{
    "Input":[
        {
        "Text":"I am learning to code in AWS",
        "SourceLanguageCode":"en",
        "TargetLanguageCode":"fr",
        "Required": true
        }
    ]
}
"""

def main():
    json_input = json.loads(json_string)
    indented_format = json.dumps(json_input, indent=2)
    print(indented_format)

if __name__=="__main__":
    main()

When you run this you will see the same data as our original string, printed with indentation so that it is easier to read.

What We Did

  • We defined a variable containing a JSON string.
  • We used json.loads() to parse the string and convert it into Python data types.
  • We then used json.dumps() to convert the Python data back into a JSON string.
  • We added some formatting to make it easier to read.

What Python Did

  • Python took the string assigned to a variable and used the json.loads() function to convert it into Python data types. It converted the JSON key pair "Required": true into the Python equivalent, 'Required': True.
  • Python then converted the Python data types back into a valid JSON string using the json.dumps() function. We added additional formatting using indent=2 to make it easier to read.
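
One way to check which direction each function works in is to look at the types involved. The sketch below uses a shortened, made-up JSON string rather than the one from the lab.

```python
import json

# Hypothetical JSON text for illustration
json_string = '{"Required": true, "Count": 2, "Tags": ["a", "b"], "Owner": null}'

parsed = json.loads(json_string)          # JSON text -> Python objects
print(type(parsed).__name__)              # the outer object becomes a dict
print(type(parsed["Required"]).__name__)  # true becomes a bool
print(type(parsed["Tags"]).__name__)      # the array becomes a list

pretty = json.dumps(parsed, indent=2)     # Python objects -> JSON text
print(type(pretty).__name__)              # the result is a str

# Indentation only changes the layout of the text, not the data it holds
print(json.loads(pretty) == parsed)
```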

Learning to navigate a JSON structure to use the information you need is probably one of the most fundamental lessons you will learn when using python with AWS. It can be confusing, but we will break it down.

If you look at the structure above you can see that it comprises a dictionary with a key of “Input” and a value of a list containing another dictionary. This is called nesting. It is very common to have dictionaries which contain lists, which contain dictionaries. This structure can keep being nested, so you need to learn how to navigate it.

  • Modify your code as below:

import json

# This uses a json string as an input 

json_string = """
{
    "Input":[
        {
        "Text":"I am learning to code in AWS",
        "SourceLanguageCode":"en",
        "TargetLanguageCode":"fr",
        "Required": true
        }
    ]
}
"""
# Modify below this line
def main():
    json_input = json.loads(json_string)
    text = json_input['Input'][0]['Text']
    source_language_code = json_input['Input'][0]['SourceLanguageCode']
    target_language_code = json_input['Input'][0]['TargetLanguageCode']
    print(text, source_language_code, target_language_code)

if __name__=="__main__":
    main()

When you run this, it should print I am learning to code in AWS en fr.

What Python Did

  • Python is navigating the structure that json.loads() created from the JSON.
    • First, because the parsed JSON is in the variable json_input, we use this as our first reference.
    • Next, the first dictionary key is "Input", so this is placed in [] as ['Input'].
    • Next in our structure is a list. A list uses an index, and the index starts at 0 for the first item, so to get the first item we use [0].
    • Finally, the item is another dictionary, so we get the values for "Text", "SourceLanguageCode" and "TargetLanguageCode" by placing each key in [], for example ['Text'].
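
The steps above can be sketched one lookup at a time. The intermediate variable names input_list and first_item are introduced here only for illustration; they are not part of the lab code.

```python
import json

json_string = """
{
    "Input": [
        {
            "Text": "I am learning to code in AWS",
            "SourceLanguageCode": "en",
            "TargetLanguageCode": "fr"
        }
    ]
}
"""

json_input = json.loads(json_string)  # the outer dictionary
input_list = json_input["Input"]      # the value of "Input": a list
first_item = input_list[0]            # the first item in the list: a dictionary
text = first_item["Text"]             # the value of "Text": a string

# The chained form performs the same lookups in a single expression
print(text == json_input["Input"][0]["Text"])
```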

You should now have a good idea of how Python navigates the JSON structure so you can get to the information you want. You may have noticed that the index value of [0] is not very flexible. What happens if the list contains more than one item? In a later lesson, we are going to learn how to iterate over lists so that we can use repetition to make our code more flexible and scalable.
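
As a brief preview only, a for loop can visit every item in the list instead of relying on a fixed index. The second entry in the input below is invented for this sketch, as is the pairs list used to collect the results.

```python
import json

# Hypothetical input with two items; the second entry is made up
json_string = """
{
    "Input": [
        {
            "Text": "I am learning to code in AWS",
            "SourceLanguageCode": "en",
            "TargetLanguageCode": "fr"
        },
        {
            "Text": "JSON is a common data format",
            "SourceLanguageCode": "en",
            "TargetLanguageCode": "de"
        }
    ]
}
"""

json_input = json.loads(json_string)

pairs = []
# Each pass through the loop assigns the next dictionary in the list to item
for item in json_input["Input"]:
    pairs.append((item["Text"], item["TargetLanguageCode"]))
    print(item["Text"], item["SourceLanguageCode"], item["TargetLanguageCode"])
```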

json.load() & json.dump()

Earlier we used json.loads() and json.dumps() to manipulate JSON strings. Now we will create a file containing JSON data to use as input for the Amazon Translate service.

  • Create a new file called translate_input.json.

  • Paste the following text into the file.

{
    "Input":[
        {
        "Text":"I am learning to code in AWS",
        "SourceLanguageCode":"en",
        "TargetLanguageCode":"fr"
        }
    ]
}
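
The lab creates this file by hand, but the same content could also be written from Python with json.dump(). The sketch below is one possible way to do that; it writes into a temporary directory (an assumption made here so that it does not overwrite your own translate_input.json) and then reads the file back with json.load().

```python
import json
import os
import tempfile

translate_input = {
    "Input": [
        {
            "Text": "I am learning to code in AWS",
            "SourceLanguageCode": "en",
            "TargetLanguageCode": "fr",
        }
    ]
}

# A temporary directory keeps this sketch away from your real lab files
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "translate_input.json")

    # json.dump() writes the Python data to the open file as JSON text
    with open(path, "w") as file_object:
        json.dump(translate_input, file_object, indent=4)

    # json.load() reads the JSON text back into Python data
    with open(path) as file_object:
        reloaded = json.load(file_object)

print(reloaded == translate_input)
```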

Modify lab_5_step_4_json_input.py.

  • Type or paste the following into the file:

# Standard Imports
import argparse
import json

# 3rd Party Imports
import boto3

# Arguments
parser = argparse.ArgumentParser(description="Provides translation between one source language and another of the same set of languages.")
parser.add_argument(
    '--file',
    dest='filename',
    help="The path to the input file. The file should be valid json",
    required=True)

args = parser.parse_args()

# Functions
def open_input():
    with open(args.filename) as file_object:
        contents = json.load(file_object)
        return contents['Input'][0]

def translate_text(**kwargs): 
    client = boto3.client('translate')
    response = client.translate_text(**kwargs)
    print(response) 

# Main Function - use to call other functions
def main():
    kwargs = open_input()
    translate_text(**kwargs)

if __name__ == "__main__":
    main()

  • To run the program, enter the following command in the terminal:

    python lab_5_step_4_json_input.py --file translate_input.json

If you have completed previous labs there should be nothing new here. But we will now break down what is going on.

  • At the top, we define our imports. These are grouped with built-in packages at the top and then installed packages.

  • Next, we use argparse to define the command-line interface. We have defined a single argument, --file.

  • Next, we define three functions.

    • The first opens the file using with open(), which gives us a file object called file_object. We then use json.load() to read the file into Python data types and navigate the structure to get the dictionary we require, which we return.

    • The second function is our standard Amazon Translate function which accepts an arbitrary number of keyword arguments.

    • Our third function is our main function. It calls the other functions in the order specified. First it calls open_input() and stores the returned dictionary in the variable kwargs. It then calls translate_text(), unpacking kwargs with ** so that each key in the dictionary is passed as a keyword argument.
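
To see what the ** unpacking does without calling AWS, the sketch below replaces the real client with a stand-in. The function fake_translate_text() is hypothetical and exists only for this illustration; it simply formats its arguments into a string.

```python
import json


def fake_translate_text(Text, SourceLanguageCode, TargetLanguageCode):
    """Hypothetical stand-in for client.translate_text(); makes no AWS call."""
    return f"{SourceLanguageCode}->{TargetLanguageCode}: {Text}"


contents = json.loads("""
{
    "Input": [
        {
            "Text": "I am learning to code in AWS",
            "SourceLanguageCode": "en",
            "TargetLanguageCode": "fr"
        }
    ]
}
""")

kwargs = contents["Input"][0]

# ** unpacks the dictionary so that each key becomes a keyword argument
unpacked = fake_translate_text(**kwargs)

# This is equivalent to writing the keyword arguments out by hand
explicit = fake_translate_text(
    Text="I am learning to code in AWS",
    SourceLanguageCode="en",
    TargetLanguageCode="fr",
)

print(unpacked == explicit)
```

Because every key in the dictionary becomes an argument name, an unexpected key such as "Required" would make this stand-in raise a TypeError, which is one reason the extra key was removed from translate_input.json.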