Monday, February 27, 2023

Matillion Configuration on AWS EC2 Server with Snowflake Database



Creating Matillion Instance for Snowflake in AWS:

Steps to be followed:

1. Create an AWS account.

2. Create a Snowflake account.

3. Create a Matillion Hub account.

4. Launch the Matillion instance for Snowflake in EC2.


AWS Account:

1. Navigate to the AWS Console: https://console.aws.amazon.com/

Click on “Create a new AWS account”.

2. Enter your account name, email, and password to sign up.

Click on Continue.

3. Enter your personal details.

4. Enter your debit/credit card details.

Click on Continue.

5. Select the Text or Call method for mobile verification (Call is the preferred option for instant verification).


Snowflake Account:

1. Navigate to https://www.snowflake.com/ and click on Start For Free.

2. Fill in your details: first name, last name, email, organization name, and country.

3. Select the edition and cloud provider.

4. You will receive a verification email. Click on Activate.



Launching Matillion Instance for Snowflake in EC2:

1. Navigate to https://www.matillion.com/products/etl-software/ and try the Matillion ETL free trial for 14 days.

Click on “Start your Trial”. It redirects to https://billing.matillion.com/setup/cloudProvider

2. Select “Amazon Web Services” as your Cloud Provider.

Select “Snowflake” as your Cloud Data Platform/Cloud Data Warehouse.

Select “Amazon Machine Image (AMI)” as your Delivery Mechanism.

3. It redirects to https://signin.aws.amazon.com/signin?redirect_uri=XXXXXX

Enter your AWS credentials to sign in.

Select any Matillion instance as shown below and click “Launch”.

4. Select the instance type.

Click on “Next: Configure Instance Details”.

Click on “Next: Add Storage”.

5. Click Add New Volume (optional).

Click on “Next: Add Tags”.

Enter the key and value for each tag.

Click on “Next: Configure Security Groups”.

6. Add a rule for “HTTP” in Security Groups.

Click on “Review and Launch”.

7. Click on Launch.

Open the instance in a browser using its public IPv4 address or public IPv4 DNS name.

8. Default credentials for Matillion ETL: Username: ec2-user, Password: <Instance ID of EC2>
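The “HTTP” security group rule from the steps above can also be added from the command line instead of the console. The following is a minimal sketch, assuming the AWS CLI is installed and configured; allow_http is a hypothetical helper name, and the group ID and CIDR range in the example are placeholders rather than values from this setup.

```shell
#!/bin/bash
# Sketch: add an inbound HTTP (port 80) rule to a security group.
# allow_http is a hypothetical helper; it only wraps one AWS CLI call.
allow_http() {
    local group_id="$1"
    local cidr="$2"
    aws ec2 authorize-security-group-ingress \
        --group-id "$group_id" --protocol tcp --port 80 --cidr "$cidr"
}

# Example (placeholder values, restrict the CIDR to your own network):
#   allow_http sg-0123456789abcdef0 203.0.113.0/24
```

Limiting the CIDR range to a known network is usually safer than opening port 80 to every address.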


 


 





Wednesday, February 22, 2023

Matillion: Automation - Catalina File Archival



Every event in the Matillion application generates a corresponding log entry in the catalina.out file. When the file grows beyond a permissible limit, server performance degrades and the Matillion process may even hang. To avoid this, Matillion has a built-in auto-archival process that archives the catalina.out file. It is scheduled to run every Sunday by default.

The size threshold can be changed from 50 MB to any desired size in the /etc/logrotate.d/tomcat8 file. However, if the file grows beyond the desired size within the week, before the next scheduled run, this does not help, and there is a high chance of the server hanging.

https://documentation.matillion.com/docs/en/6052919

/var/log/tomcat8/catalina.out {
    copytruncate
    daily
    rotate 7
    compress
    missingok
    size 50M
}
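Changing the size threshold in the configuration above can be scripted. The sketch below assumes a config file with a single "size" line like the one shown; set_logrotate_size is a hypothetical helper name and not part of logrotate itself.

```shell
#!/bin/bash
# Sketch: rewrite the "size" directive of a logrotate config file in place.
# set_logrotate_size is a hypothetical helper used only for illustration.
set_logrotate_size() {
    local conf="$1"
    local new_size="$2"
    local tmp
    tmp=$(mktemp) || return 1
    # Touch only lines of the form "size <value>", keeping their indentation
    if sed -E "s/^([[:space:]]*size[[:space:]]+)[0-9]+[kMG]?[[:space:]]*$/\1${new_size}/" "$conf" > "$tmp"; then
        # cat into the original keeps its owner and permissions
        cat "$tmp" > "$conf"
    fi
    rm -f "$tmp"
}

# Example (as root, after backing up the file):
#   set_logrotate_size /etc/logrotate.d/tomcat8 100M
#   logrotate -d /etc/logrotate.d/tomcat8   # -d is a dry run, nothing is rotated
```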

Process Highlight:

To handle this case, a script checks the catalina.out file size every three hours. If the size is greater than the desired file size, the script automatically archives the file; otherwise it waits for the next scheduled run. The script also cleans the archival folder by deleting archived files older than the given number of days.
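The size check at the heart of this process can be sketched as a small function. This is an illustration under assumptions, not the script used on the server: needs_archive is a hypothetical name, and the limit is treated as megabytes of 1024 x 1024 bytes.

```shell
#!/bin/bash
# Sketch: succeed when a file is larger than a limit given in MB.
# needs_archive is a hypothetical helper used only for illustration.
needs_archive() {
    local file="$1"
    local limit_mb="$2"
    local size_bytes
    # Return 2 when the file cannot be read
    size_bytes=$(wc -c < "$file") || return 2
    # Compare in bytes so a file just over the limit is not rounded down
    [ "$((size_bytes))" -gt "$((limit_mb * 1024 * 1024))" ]
}

# Example:
#   if needs_archive /var/log/tomcat8/catalina.out 500; then echo "archive now"; fi
```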

Benefits:

·       Prevents jobs from getting into a hung state

·       Keeps web services always alive

·       No manual intervention

Flow chart:




Script Details

·       Script name: autolog_archive.sh

·       Script path: /home/centos

·       Script owner: root

·       Script permissions: read, write, execute

·       Script parameters:

o   file_size_limit_mb: the file size, in MB, above which the log is archived.

o   history_days_limit: the number of days archived files are kept on the server.

·       Supporting files: the script generates the file below.

o   autolog_archivelog.txt - a log file in the same home directory. It records the actions performed by the script as text with date and time. The time format used in the log is UTC. The file is updated on every execution of the script.
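To illustrate how a retention setting such as history_days_limit can be applied, here is a minimal sketch using find. The helper name prune_archives is hypothetical, and the directory is passed in by the caller rather than hard-coded.

```shell
#!/bin/bash
# Sketch: delete compressed archives older than a given number of days.
# prune_archives is a hypothetical helper used only for illustration.
prune_archives() {
    local dir="$1"
    local days="$2"
    # -mtime +N selects files last modified more than N days ago;
    # a bare N would match only files that are exactly N days old
    find "$dir" -name '*.gz' -type f -mtime +"$days" -delete
}

# Example:
#   prune_archives /var/log/tomcat8/archival 10
```

Only files ending in .gz are removed, so anything else placed in the folder is left alone.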







Script:

#!/bin/bash
# Archive catalina.out when it exceeds the size limit, then prune old archives.
file_size_limit_mb=500
history_days_limit=10
log_file=/var/log/tomcat8/catalina.out
archive_dir=/var/log/tomcat8/archival
action_log=/home/centos/autolog_archivelog.txt

# Size in whole MB. ls -lh prints mixed units (K, M, G), which cannot be compared reliably.
log_file_size=$(du -m "$log_file" | cut -f1)
echo "$log_file_size" >> "$action_log"

if [ "$log_file_size" -gt "$file_size_limit_mb" ]
then
    # Build the name once so that cp and gzip refer to the same file
    filename=catalina$(date +%Y%m%d%H%M%S).out
    cp "$log_file" "$archive_dir/$filename"
    gzip "$archive_dir/$filename"

    sudo logrotate -f /etc/logrotate.d/tomcatrotate
    echo 'log clear cmd executed' >> "$action_log"
    # +N matches files older than N days; a bare N matches only files exactly N days old
    find "$archive_dir" -name "*.gz" -type f -mtime +"$history_days_limit" -delete
    echo 'deleted files older than given days' >> "$action_log"
else
    echo 'not archived' >> "$action_log"
fi
date >> "$action_log"
echo ------------------------------- >> "$action_log"

#crontab entries are in below file
#/etc/crontab
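For the three-hour schedule mentioned earlier, an entry in /etc/crontab could be added along the following lines. This is a sketch under assumptions: add_cron_entry is a hypothetical helper, and the exact schedule line is an illustration, since the actual crontab entry is not shown in this post.

```shell
#!/bin/bash
# Sketch: append a line to a crontab file only if it is not already present.
# add_cron_entry is a hypothetical helper used only for illustration.
add_cron_entry() {
    local crontab_file="$1"
    local entry="$2"
    # -x matches whole lines, -F treats the entry as a fixed string (it contains * and /)
    grep -qxF -- "$entry" "$crontab_file" 2>/dev/null ||
        printf '%s\n' "$entry" >> "$crontab_file"
}

# Example (as root): minute 0 of every third hour; /etc/crontab lines carry a user field
#   add_cron_entry /etc/crontab '0 */3 * * * root /home/centos/autolog_archive.sh'
```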

Matillion: Automation - Restart Matillion Services Weekly


Non-PRD Matillion servers are heavily used by developers every day, which leaves many processes and connections open at the server level. As a result, other users may be unable to log in, or server performance degrades. To resolve this, the admin team has developed an automated script that recycles the Matillion non-prod services without manual intervention and sends email alerts on success or failure. When the services are recycled, all hung or unused processes are killed. Running the script also flushes the temporary memory used on the Linux servers.




  1. The script is executed by an external scheduler or crontab.
  2. The script first checks whether the tomcat service is running.
  3. If the tomcat service is not running to begin with, the restart is not performed and a failure mail is sent.
  4. If the tomcat service is running, the script executes the stop command to stop the tomcat service.
  5. After about a minute, the script checks whether the tomcat service has stopped.
  6. If the tomcat service has stopped, the script executes the start command, confirms that the tomcat service has started, and then sends a success notification email to the configured recipients.
  7. If the tomcat service has not stopped, the script tries to stop it once again, repeating from step 4.
  8. If the tomcat service has not stopped after 3 consecutive attempts, a failure notification mail is sent.
  9. All activities are captured in the restart_log.txt file.
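The stop-and-retry flow in the steps above can be sketched as one function that receives the service commands as function names, so the logic can be tried without touching tomcat. All names here (restart_with_retry and the tomcat_* wrappers in the example) are hypothetical, and mail and log handling are left out for brevity.

```shell
#!/bin/bash
# Sketch: stop a service, retry up to max_attempts times, then start it again.
# Arguments: <is_running fn> <stop fn> <start fn> [max_attempts] [wait_secs]
restart_with_retry() {
    local is_running="$1"
    local stop_cmd="$2"
    local start_cmd="$3"
    local max_attempts="${4:-3}"
    local wait_secs="${5:-60}"
    local attempt=1

    # Already stopped: do not restart, report instead
    if ! "$is_running"; then
        echo "already stopped"
        return 2
    fi

    while [ "$attempt" -le "$max_attempts" ]; do
        "$stop_cmd"
        sleep "$wait_secs"
        if ! "$is_running"; then
            "$start_cmd"
            echo "restarted on attempt $attempt"
            return 0
        fi
        echo "attempt $attempt failed"
        attempt=$((attempt + 1))
    done
    return 1
}

# Example wiring for the real service:
#   tomcat_running() { systemctl is-active --quiet tomcat8.service; }
#   tomcat_stop()    { sudo systemctl stop tomcat8.service; }
#   tomcat_start()   { sudo systemctl start tomcat8.service; }
#   restart_with_retry tomcat_running tomcat_stop tomcat_start
```

The return codes (0 restarted, 1 could not stop, 2 already stopped) map onto the three mail types the script sends.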

Benefits:

·       Kills unwanted processes and frees up server CPU for Matillion

·       Prevents jobs from getting into a hung state

·       Keeps web services always alive

·       Kills existing long-running jobs

·       No manual intervention

·       Alerts the admin team by sending notifications on success or failure

Process Highlight:

·       If the webserver is already stopped because of an ad hoc request, the script does not execute the stop command; it sends a failure notification mail instead.

·       The script makes up to three attempts to stop the webserver and sends a failure mail for each failed attempt.

·       The script writes its actions to a log for easy analysis.

Log Files: Every action executed by the script is written to the restart_log.txt log file, along with a timestamp for tracing. The time format used in the log is UTC. The file is updated on every execution of the script.

Email Notifications: The script sends 3 types of mail, depending on the scenario:

       Restart Success Mail

       Restart Failure Mail

       Already in Stop State Mail
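A notification of any of these types is a short message with a To: and a Subject: header. The sketch below shows one way to build such a message; compose_mail is a hypothetical helper, and the address in the example is a placeholder.

```shell
#!/bin/bash
# Sketch: print a minimal mail message with To: and Subject: headers.
# compose_mail is a hypothetical helper used only for illustration.
compose_mail() {
    local to="$1"
    local subject="$2"
    local body="$3"
    # Headers first, then one blank line, then the body
    printf 'To: %s\nSubject: %s\n\n%s\n' "$to" "$subject" "$body"
}

# Example: with -t, sendmail takes the recipients from the To: header
#   compose_mail "admin@example.com" "QA Matillion Services Restarted Successfully" \
#       "Restart finished at $(date)" | /usr/sbin/sendmail -t
```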

Script Details

·       Script name: restart_sh.sh

·       Script path: /home/centos

·       Script owner: root

·       Script permissions: read, write, execute

·       Supporting files: the script generates the files below.

o   status.txt - used within the script to check the status of the tomcat service. It is created anew on every execution, in the same path.

o   restart_log.txt - a log file that records the actions performed by the script as text with date and time. The time format used in the log is UTC. The file is updated on every execution of the script.

Script:

#!/bin/bash
recipients="ambarish@abcd.com"
sender="ambarish@abcd.com"
cd /home/admin_usr1/ || exit 1

# Append a timestamped line to the log
log() { echo "$(date "+%D %T") :$1" >> restart_log.txt; }

# With -t, sendmail takes the recipients from the To: header, so the header is set explicitly
send_mail() { printf 'To: %s\nSubject: %s\n\n' "$recipients" "$1" | /usr/sbin/sendmail -f "$sender" -t; }

# Print 1 if tomcat is running, else 0; status.txt is rewritten on every check
check_running() {
    systemctl status tomcat8.service > status.txt
    grep -c 'active (running)' status.txt
}

run_flag=$(check_running)
if [ "$run_flag" -eq 1 ]
then
    log "webserver is active"
else
    log "webserver is inactive"
fi

attempt=1
while [ "$attempt" -le 3 ]
do
    log "stop webserver attempt: $attempt"
    if [ "$run_flag" -eq 1 ]
    then
        sudo systemctl stop tomcat8.service
        echo "$(date "+%D %T") :stop command executed"
        log "stop webserver cmd executed"
        log "waiting for webserver to stop"
        sleep 50s
        run_flag=$(check_running)

        if [ "$run_flag" -eq 0 ]
        then
            sudo systemctl start tomcat8.service
            echo "$(date "+%D %T") :start command executed"
            log "start tomcat cmd executed, attempt $attempt successful"
            sleep 50s
            run_flag=$(check_running)

            if [ "$run_flag" -eq 1 ]
            then
                log "service is in running state"
                send_mail "QA Matillion Services Restarted Successfully"
                log "success mail sent"
            fi
            break
        else
            log "webserver not responding, attempt $attempt failed"
            send_mail "QA Matillion Services Restart Attempt $attempt failed"
            log "failure mail sent"
        fi
    else
        log "existing webserver was already in stopped state, restart aborted"
        send_mail "QA Matillion Services Already In Stop State, Restart Attempt $attempt failed"
        log "failure mail sent"
        break
    fi
    attempt=$((attempt + 1))
done
echo "$(date "+%D %T") ------------------end of execution-----------------" >> restart_log.txt

#to edit crontab
#crontab -e
#To check the running logs
#tail -f /home/admin_usr1/restart_log.txt

Matillion Upgrade Steps from 1.64.7 to 1.65.8


Step 1: Check the current version.

Go to Help → About.











Step 2: Take a snapshot backup of the existing server in AWS EC2. Then

go to Admin → Matillion ETL Updates.

Click on Check for Updates and follow the steps as guided.














































 Finish the wizard by clicking OK.
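
The snapshot backup in Step 2 can also be taken from the command line before starting the update. This is a minimal sketch, assuming the AWS CLI is installed and configured; snapshot_volume is a hypothetical helper name, and the volume ID in the example is a placeholder for the EBS volume attached to the Matillion instance.

```shell
#!/bin/bash
# Sketch: create an EBS snapshot of one volume with a description.
# snapshot_volume is a hypothetical helper; it only wraps one AWS CLI call.
snapshot_volume() {
    local volume_id="$1"
    local description="${2:-Pre-upgrade backup}"
    aws ec2 create-snapshot --volume-id "$volume_id" --description "$description"
}

# Example (placeholder volume ID):
#   snapshot_volume vol-0123456789abcdef0 "Matillion before upgrade to 1.65.8"
```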











Matillion Upgrade Steps from 1.54 to 1.59

 

Step 1: Open the server and get root access.

$ sudo su -

Step 2: Execute the command below to show the currently installed version of Matillion.

# yum list installed 'matillion*'


NOTE: Two packages are shown. Both need to be updated to upgrade Matillion.

Step 3: Refresh the packages in the local yum repository cache.

3.1 Execute the commands below to check the existing, previously downloaded packages available in the local cache. They show both packages.

# yum list available matillion-emerald --showduplicates

# yum list available matillion-emerald-cdata --showduplicates


3.2 As the existing local cache does not contain the latest packages, it needs to be cleared and downloaded again.

# yum clean all


Running the list commands again now downloads fresh repository metadata and shows the latest available packages.

# yum list available matillion-emerald-cdata --showduplicates

# yum list available matillion-emerald --showduplicates

























Step 4: Install the downloaded 1.59 packages. This has to be done in the order below.

4.1 Install the first package (noarch).

# yum update matillion-emerald-1.59.11-1

Type y and press ENTER.

4.2 Install the second package (CData).

# yum update matillion-emerald-cdata-1.59.11-2

Type y and press ENTER.














4.3 Check the updated version

# yum list installed 'matillion*'








Step 5: Check that the tomcat service is up and running.

# service tomcat8 status

Step 6: Log in to the Matillion site and check for the updated version.
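
The version check at the end of the upgrade can also be scripted. The sketch below compares dotted version strings with the version sort of GNU sort; version_ge is a hypothetical helper name, and the rpm query in the example assumes the package name matillion-emerald used in the steps above.

```shell
#!/bin/bash
# Sketch: succeed when version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper; it relies on GNU sort -V (version sort).
version_ge() {
    # The smaller of the two versions sorts first, so $1 >= $2 when that one is $2
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: confirm that the installed package reached the target release
#   installed=$(rpm -q --queryformat '%{VERSION}' matillion-emerald)
#   version_ge "$installed" 1.59.11 && echo "upgrade applied"
```

A plain string comparison would rank 1.9 above 1.10, which is why a version-aware sort is used here.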