Lets us Scan Complete DynamoDB Table: Python Boto3
Lets us Scan Complete DynamoDB Table: Python Boto3
How to Query full dynamo DB table with filters in place
In this blog, we will see examples of how to do a full query of a dynamo DB table. We will also see in more details with code, the concepts of applying filters and conditions to obtain only required records.
Why we need to filter the records?
The question is why do we need to scan the DB if we can fetch the records based on the partition key and sort key?
Case 1. When we need to list all the partition key and sort keys
Case2. When we need to take a snapshot of the data and back it up or replicate it up in the AWS S3 as S3 is cost effective and comes with flavor of Intelligent Tiering.
warning: SCAN is a costly operation as it has to scan through the complete table so where ever possible , use query operation instead scan.
Please see my Youtube video to see the explanation of the 2 scan operation:
Video 1 while applying a single Filter on the Table
Please create the table as required while watching the video above,
The program is as below:
****Code Starts Here******************
import boto3 | |
import botocore | |
from boto3.dynamodb.conditions import Attr | |
client = boto3.resource('dynamodb',aws_access_key_id='AKIAVLFFKFZS2F2SAEXQ', aws_secret_access_key='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', region_name='us-east-1') | |
list(client.tables.all()) | |
def scan_database(table, scan_kwargs=None): | |
if scan_kwargs is None: | |
scan_kwargs = {} | |
dynamodb = boto3.resource('dynamodb',aws_access_key_id='AKIAVLFFKFZS2F2SAEXQ', aws_secret_access_key='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', region_name='us-east-1') | |
table = dynamodb.Table(table) | |
complete = False | |
records = [] | |
while not complete: | |
try: | |
response = table.scan(**scan_kwargs) | |
except botocore.exceptions.ClientError as error: | |
raise Exception('Error quering DB: {}'.format(error)) | |
records.extend(response.get('Items', [])) | |
next_key = response.get('LastEvaluatedKey') | |
scan_kwargs['ExclusiveStartKey'] = next_key | |
complete = True if next_key is None else False | |
return records | |
if __name__ == '__main__': | |
table_name = "employeeT" | |
[print(i) for i in scan_database(table_name)] | |
if __name__ == '__main__': | |
table_name = "employeeT" | |
kwargs = { | |
'FilterExpression': Attr("id").begins_with('ppm') | |
} | |
[print(i) for i in scan_database(table_name, kwargs)] | |
dynamoDBClient = boto3.client('dynamodb',aws_access_key_id='AKIAVLFFKFZS2F2SAEXQ', aws_secret_access_key='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', region_name='us-east-1') | |
table = dynamoDBClient.describe_table(TableName='employeeT') | |
print(table) | |
response = dynamoDBClient.describe_table(TableName='employeeT') | |
print(response['Table']['ItemCount']) | |
*****Code Finish Here******************
Please Use Git Hub Link below:
Video 2 while applying a Multiple Filter on the Table
Please create the table as required while watching the video above,
import boto3 | |
import botocore | |
from boto3.dynamodb.conditions import Attr | |
def scan_database(table, scan_kwargs=None): | |
if scan_kwargs is None: | |
scan_kwargs = {} | |
dynamodb = boto3.resource('dynamodb',aws_access_key_id='AKIAVLFFKFZSR2PIHS2L', aws_secret_access_key='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', region_name='us-east-1') | |
table = dynamodb.Table(table) | |
complete = False | |
records = [] | |
while not complete: | |
try: | |
response = table.scan(**scan_kwargs) | |
except botocore.exceptions.ClientError as error: | |
raise Exception('Error quering DB: {}'.format(error)) | |
records.extend(response.get('Items', [])) | |
next_key = response.get('LastEvaluatedKey') | |
scan_kwargs['ExclusiveStartKey'] = next_key | |
complete = True if next_key is None else False | |
return records | |
if __name__ == '__main__': | |
table_name = "employee" | |
[print(i) for i in scan_database(table_name)] | |
if __name__ == '__main__': | |
table_name = "employee" | |
kwargs = { | |
'FilterExpression': Attr("salary").eq(35007) & | |
Attr("id").begins_with('ppd_') & | |
Attr("dept").ne('Hr') | |
} | |
[print(i) for i in scan_database(table_name, kwargs)] | |
Output: | |
{'dept': 'Fin', 'id': 'ppd_927', 'name': '11forfivesix', 'salary': Decimal('35007')} | |
{'dept': 'Fin', 'id': 'ppd_626', 'name': '9forfivesix', 'salary': Decimal('35007')} |
Please Use Git Hub Link below:
AWS DynamoDB SCAN Multiple Filter
Please visit the DataPundit youtube channel for more AWS , Abinitio, UNIX , SQL , DWH videos. https://www.youtube.com/@datapundit
by DataPundit https://www.youtube.com/@datapundit
Good Read here too
ReplyDelete