OiO.lk Community platform!

Oio.lk is an excellent forum for developers, providing a wide range of resources, discussions, and support for those in the developer community. Join oio.lk today to connect with like-minded professionals, share insights, and stay updated on the latest trends and technologies in the development field.
  You need to log in or register to access the solved answers to this problem.
  • You have reached the maximum number of guest views allowed
  • Please register below to remove this limitation

A python / Athena / Boto3 script referencing "all_agent_data_partitioned" database table fails when reading a Json array that is sometimes empty

  • Thread starter Thread starter user25137531
  • Start date Start date
U

user25137531

Guest
I am reading json encoded data from the AWS Connect all_agent_data_partitioned database table . I am specifically trying to read the currentagentsnapshot.contacts.contactid Array field.

When I read the currentagentsnapshot.contacts array my query runs successfully and I see populated and empty array entries as shown below

Code:
contacts  
[{channel=VOICE, 
  connectedtoagenttimestamp=xxxxxx,
  contactid=xxxxxx,
  initiationmethod=xxxxxx,
  queue={arn=arn:xxxxxx, name=xxxxxx}, 
  queuetimestamp=xxxxxx, 
  state=xxxxxx, 
  statestarttimestamp=xxxxxx}]
[]
[]

I have no issues reading other fields/arrays within this database table. I only have issues when reading the currentagentsnapshot.contacts.contactid array field from the all_agent_data_partitioned database table using the script below. The code is being run from Spyder.

I have included code below from the start of the script through the error. I am receiving the following error: InvalidRequestException: An error occurred (InvalidRequestException) when calling the GetQueryResults operation: Query did not finish successfully. Final query state: FAILED

I am guessing that the error has to do with looking up a field value in an empty ([]) array, but this is juts a guess.

I need to obtain contactid values from the array when the data exists.

I am new to Athena/BOTO3/Python/Json...but am making progress. Any help over this current barrier would be appreciated.

Code:
# Build an Athena Client using BOTO
import time
import boto3
AWS_ACCESS_KEY = "XXXXXXXXXXXX"
AWS_SECRET_KEY = "XXXXXXXXXXXX"
AWS_REGION = "ap-southeast-1"
athena_client = boto3.client(
    "athena",
    aws_access_key_id=AWS_ACCESS_KEY,
    aws_secret_access_key=AWS_SECRET_KEY, 
    region_name=AWS_REGION,
)


# Use the client to run a query
query_response = athena_client.start_query_execution(

    QueryString="""SELECT eventid
                      ,eventtype
                      ,eventtimestamp

                      ,currentagentsnapshot.agentstatus.name AS agentstatusname
                      ,... other fields to return                     
                      ,currentagentsnapshot.contacts.contactid AS contactid            
   
                     FROM all_agent_data_partitioned 
                     WHERE eventtype != 'HEART_BEAT'  
                           and Other filters 
        
        """,
    QueryExecutionContext={"Catalog": "AwsDataCatalog",
                          "Database": "asia_ml_connect_db"},
    ResultConfiguration={"OutputLocation": "XXXXXXXXXXXX"},
   
)
while True:
    try:
        # This function only loads the first 1000 rows
        athena_client.get_query_results(
            QueryExecutionId=query_response["QueryExecutionId"]
        )
        break
    except Exception as err:
        if "not yet finished" in str(err):
            time.sleep(1.0)
        else:
            raise err
<p>I am reading json encoded data from the AWS Connect all_agent_data_partitioned database table .
I am specifically trying to read the currentagentsnapshot.contacts.contactid Array field.</p>
<p>When I read the currentagentsnapshot.contacts array my query runs successfully and I see populated and empty array entries as shown below</p>
<pre><code>contacts
[{channel=VOICE,
connectedtoagenttimestamp=xxxxxx,
contactid=xxxxxx,
initiationmethod=xxxxxx,
queue={arn=arn:xxxxxx, name=xxxxxx},
queuetimestamp=xxxxxx,
state=xxxxxx,
statestarttimestamp=xxxxxx}]
[]
[]

</code></pre>
<p>I have no issues reading other fields/arrays within this database table.
I only have issues when reading the currentagentsnapshot.contacts.contactid array field from the all_agent_data_partitioned database table using the script below.
The code is being run from Spyder.</p>
<p>I have included code below from the start of the script through the error.
I am receiving the following error: InvalidRequestException: An error occurred (InvalidRequestException) when calling the GetQueryResults operation: Query did not finish successfully. Final query state: FAILED</p>
<p>I am guessing that the error has to do with looking up a field value in an empty ([]) array, but this is juts a guess.</p>
<p>I need to obtain contactid values from the array when the data exists.</p>
<p>I am new to Athena/BOTO3/Python/Json...but am making progress.
Any help over this current barrier would be appreciated.</p>
<pre><code># Build an Athena Client using BOTO
import time
import boto3
AWS_ACCESS_KEY = "XXXXXXXXXXXX"
AWS_SECRET_KEY = "XXXXXXXXXXXX"
AWS_REGION = "ap-southeast-1"
athena_client = boto3.client(
"athena",
aws_access_key_id=AWS_ACCESS_KEY,
aws_secret_access_key=AWS_SECRET_KEY,
region_name=AWS_REGION,
)


# Use the client to run a query
query_response = athena_client.start_query_execution(

QueryString="""SELECT eventid
,eventtype
,eventtimestamp

,currentagentsnapshot.agentstatus.name AS agentstatusname
,... other fields to return
,currentagentsnapshot.contacts.contactid AS contactid

FROM all_agent_data_partitioned
WHERE eventtype != 'HEART_BEAT'
and Other filters

""",
QueryExecutionContext={"Catalog": "AwsDataCatalog",
"Database": "asia_ml_connect_db"},
ResultConfiguration={"OutputLocation": "XXXXXXXXXXXX"},

)
while True:
try:
# This function only loads the first 1000 rows
athena_client.get_query_results(
QueryExecutionId=query_response["QueryExecutionId"]
)
break
except Exception as err:
if "not yet finished" in str(err):
time.sleep(1.0)
else:
raise err
</code></pre>
 

Latest posts

Top