Skip to content

rodcsant/dio-live-aws-bigdata-2

 
 

Repository files navigation

Instruções para atividade prática

Repositório para a Live de 24/06/2021

Roteiro:

  1. Criar um Stream Delivery com o AWS Kinesis Firehose

  2. Configurar instance no AWS EC2

  3. Gerar logs de processamento de dados com Python

  4. Armazenar logs no AWS S3

  5. Manipular dados no AWS Glue Data Brew (análise gráfica)

Kinesis Delivery Stream

  • AWS Console -> Kinesis -> Create Firehose Delivery Stream "StreamName" -> Direct PUT -> Next -> Choose Destination -> Create S3 Bucket “covid-vaccines-logs-diolive” -> Configure settings -> buffer size 5mb -> buffer interval 60s -> IAM Role -> create new role -> Review and create

EC2

  • AWS Console -> EC2 -> Amazon Linux 2 AMI -> t2micro -> review and launch -> create new key pair -> download .pem file -> download putty -> puttygen -> load.pem file -> save .ppk file -> putty copy dns -> paste hostname -> SSH -> auth -> load ppk file -> login “ec2-user”

    • sudo yum install -y aws-kinesis-agent
    • sudo yum install -y git
    • git clone https://github.com/cassianobrexbit/dio-live-aws-bigdata-2.git
    • unzip Dataset.zip
    • chmod a+x LogGenerator.py
    • nano LogGenerator.py
    • less country_vaccinations.csv
    • sudo mkdir /var/log/diolive
    • cd /etc/aws-kinesis
    • sudo nano agent.json
    • Copiar conteúdo do arquivo agent.json
    • agent.json -> "kinesis.endpoint": "kinesis..amazonaws.com"
  • AWS Console -> EC2 -> Instances -> Select Instance -> Security -> Modify IAM Role -> Create New Role -> EC2 -> Administrator Access -> rolename “ec2-admin-role” -> save

    • sudo service aws-kinesis-agent start
    • sudo chkconfig aws-kinesis-agent on (start with instance)
    • _cd ~_
    • sudo ./LogGenerator.py 500000
    • tail -f /var/log/aws-kinesis-agent/aws-kinesis-agent.log

S3

  • AWS Console -> S3 -> select bucket -> selecionar arquivo e download

SSH EC2

  • sudo service aws-kinesis-agent restart
  • sudo ./LogGenerator.py
  • tail -f /var/log/aws-kinesis-agent/aws-kinesis-agent.log
  • AWS Console -> Kinesis Streams -> select stream -> monitoring

AWS Glue

  • AWS Console -> glue databrew -> create new project -> create new role -> create project
  • Create dataset -> S3 -> formato CSV