Merge PR #338 #90DaysOfDevOps #Databases from @dbafromthecold

90DaysOfDevOps-Databases
This commit is contained in:
Michael Cade 2023-03-05 10:00:09 +00:00 committed by GitHub
commit 2ef7a393e4
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
34 changed files with 1402 additions and 7 deletions

2023.md

@@ -126,13 +126,13 @@ Or contact us via Twitter, my handle is [@MichaelCade1](https://twitter.com/Mich
 ### Databases
-- [] 🛢 63 > [](2023/day63.md)
+- [] 🛢 63 > [An introduction to databases](2023/day63.md)
-- [] 🛢 64 > [](2023/day64.md)
+- [] 🛢 64 > [Querying data in databases](2023/day64.md)
-- [] 🛢 65 > [](2023/day65.md)
+- [] 🛢 65 > [Backing up and restoring databases](2023/day65.md)
-- [] 🛢 66 > [](2023/day66.md)
+- [] 🛢 66 > [High availability and disaster recovery](2023/day66.md)
-- [] 🛢 67 > [](2023/day67.md)
+- [] 🛢 67 > [Performance tuning](2023/day67.md)
-- [] 🛢 68 > [](2023/day68.md)
+- [] 🛢 68 > [Database security](2023/day68.md)
-- [] 🛢 69 > [](2023/day69.md)
+- [] 🛢 69 > [Monitoring and troubleshooting database issues](2023/day69.md)
 ### Serverless

2023/day63.md
@@ -0,0 +1,134 @@
# An introduction to databases
Welcome to the 90DaysOfDevOps database series. Over the next seven days we'll be talking about all things database related!
The aim of this series of blog posts is to provide an introduction to databases and their various concepts so that you will be able to make an informed choice when deciding how to store data in your future projects.
Here's what we'll be covering: -
- An introduction to databases
- Querying data in databases
- Backing up databases
- High availability and disaster recovery
- Performance tuning
- Database security
- Monitoring and troubleshooting database issues
We'll also be providing examples to accompany the concepts discussed. In order to do so, you will need Docker Desktop installed. Docker Desktop can be downloaded here (https://www.docker.com/products/docker-desktop/) and is completely free.
Alternatives to Docker Desktop can be used (such as Rancher Desktop or Finch) but the examples will focus on Docker.
We'll be using a custom PostgreSQL image in the examples and connecting with pgAdmin: -
https://www.pgadmin.org/
<br>
# About Us
<b>Andrew Pruski</b><br>
Andrew is a Field Solutions Architect working for Pure Storage. He is a Microsoft Data Platform MVP, Certified Kubernetes Administrator, and Raspberry Pi tinkerer. You can find him on Twitter @dbafromthecold, on LinkedIn, and blogging at dbafromthecold.com.
<b>Taylor Riggan</b><br>
Taylor is a Sr. Graph Architect on the Amazon Neptune development team at Amazon Web Services. He works with customers of all sizes to help them learn and use purpose-built NoSQL databases via the creation of reference architectures, sample solutions, and delivering hands-on workshops. You can find him on Twitter @triggan and LinkedIn.
<br>
# Why databases?
The total amount of data created worldwide is predicted to reach 181 zettabytes by 2025.
That's 181 billion terabytes!
![](images/day63-1.png)
source - https://www.statista.com/statistics/871513/worldwide-data-created/
Imagine if all that data was stored in flat files, for example, Excel sheets! OK, storing that data might not be such an issue, just save the file on a networked drive and all good! But what about when it comes to retrieving that data? What about updating a single record amongst hundreds, thousands, millions of files?
This is where database technologies come into play. Databases give us the ability to not only store data but to easily retrieve, update, and delete individual records.
<br>
# Relational databases
When it comes to databases, there are two main types...relational and non-relational (or NoSQL) databases.
SQL Server, Oracle, MySQL, and PostgreSQL are all types of relational databases.
Relational databases were first described by Edgar Codd in 1970 whilst he was working at IBM, in a research paper, “A Relational Model of Data for Large Shared Data Banks”.
This paper led the way for the rise of the various different relational databases that we have today.
In a relational database, data is organised into tables (containing rows and columns) and these tables have “relationships” with each other.
For example, a Person table may have an addressID column which points to a row within an Address table; this allows an end user or application to easily retrieve a record from the Person table and the related record from the Address table.
The addressID column is a unique “key” in the Address table but is present in the Person table as a “foreign key”.
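As a rough sketch, those two tables might be defined like this (the column names here are just for illustration, they are not from a database we use later): -

```sql
CREATE TABLE Address (
    addressID INT PRIMARY KEY,      -- unique key for the Address table
    street    VARCHAR(100),
    city      VARCHAR(50)
);

CREATE TABLE Person (
    personID  INT PRIMARY KEY,
    name      VARCHAR(100),
    addressID INT REFERENCES Address (addressID)  -- foreign key pointing back to Address
);
```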
The design of the tables and the relations between them in a relational database is said to be the database schema. The process of building this schema is called database normalisation.
Data is selected, updated, or deleted from a relational database via a programming language called SQL (Structured Query Language).
To support retrieving data from tables in a relational database, there is the concept of “indexes”. To locate one row or a subset of rows from a table, indexes provide a way for queries to quickly identify the rows they are looking for, without having to scan all the rows in the table.
The analogy often used when describing indexes is the index of a book. The user (or query) uses the index to go directly to the page (or row) they are looking for, without having to “scan” all the way through the book from the start.
Queries accessing databases can also be referred to as transactions…a logical unit of work that accesses and/or modifies the data. In order to maintain consistency in the database, transactions must have certain properties. These properties are referred to as ACID properties: -
A - Atomicity - all of the transaction completes or none of it does<br>
C - Consistency - the data modified must not violate the integrity of the database<br>
I - Isolation - multiple transactions take place independently of one another<br>
D - Durability - once a transaction has completed, it will remain in the system, even in the event of a system failure.
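To make atomicity concrete, here's a minimal sketch of a transaction in SQL (the account table and its columns are purely hypothetical): -

```sql
BEGIN;                            -- start the transaction

-- both updates succeed together or fail together
UPDATE account SET balance = balance - 100 WHERE account_id = 1;
UPDATE account SET balance = balance + 100 WHERE account_id = 2;

COMMIT;                           -- make the changes permanent
-- ROLLBACK;                      -- ...or undo everything if something went wrong
```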
We will go through querying relational databases in the next blog post.
<br>
# Non-Relational databases
The downside of relational databases is that the data ingested has to "fit" the structure of the database schema. But what if we're dealing with large amounts of data that doesn't match that structure?
This is where non-relational databases come into play. These types of databases are referred to as NoSQL (non-SQL or Not Only SQL) databases and are either schema-free or have a schema that allows for changes in the structure.
Apache Cassandra, MongoDB, and Redis are all types of NoSQL databases.
Non-relational databases have existed since the 1960s, but the term “NoSQL” was used in 1998 by Carlo Strozzi when naming his Strozzi NoSQL database; however, that was still a relational database. It wasn't until 2009 that Johan Oskarsson reintroduced the term, when he organised an event to discuss “open-source distributed, non-relational databases”.
There are various different types of NoSQL databases, all of which store and retrieve data differently.
For example: -
Apache Cassandra is a wide-column store database. It uses tables, rows, and columns like a relational database but the names and formats of the columns can vary from row to row in the same table. It uses Cassandra Query Language (CQL) to access the data stored.
MongoDB is a document store database. Data is stored as objects (documents) within the database that do not adhere to a defined schema. MongoDB supports a variety of methods to access data, such as range queries and regular expression searches.
Redis is a distributed in-memory key-value database. Redis supports many different data structures - sets, hashes, lists, etc. - https://redis.com/redis-enterprise/data-structures/
Records are identified using a unique key, and Redis provides client libraries for many different programming languages to access the data stored.
NoSQL databases generally do not comply with ACID properties but there are exceptions.
Each has pros and cons when it comes to storing data; which one to use would be decided by the type of data being ingested.
<br>
# When to use relational vs non-relational databases
This is an interesting question and the answer, unfortunately, is it depends.
It all depends on the type of data being stored, where it is to be stored, and how it is to be accessed.
If you have data that is highly structured, stored in a central location, and will be accessed by complex queries (such as reports), then a relational database would be the right choice.
If, however, the data is loosely structured, needs to be available in multiple regions, and will be retrieved with a specific type of query (e.g. a quick lookup in a key/value store), then a non-relational database would be the right choice.
There is a massive caveat with the statements above, however…there are types of non-relational databases that can handle large, complex queries, and likewise, relational databases have features that allow for data to be available in multiple regions.
It also comes down to the skillset of the people involved. For example, Andrew is a former SQL Server DBA…so we know what his default choice would be when choosing a type of database!
In contrast, Taylor works on the development team for one of the more popular, cloud-hosted, graph databases, so he is more likely to start with a NoSQL data store.
The great thing about databases is that there are so many options to choose from within the realm of commercial offerings, cloud services, and the open-source ecosystem. The amount of choice, however, can be daunting for someone new to this space.
Join us tomorrow when we'll be talking about querying databases.
Thanks for reading!

2023/day64.md
@@ -0,0 +1,290 @@
# Querying data in databases
Hello and welcome to the second post in the database part of the 90 Days of DevOps blog series!
In this post we will be going through spinning up an instance of PostgreSQL in a Docker container, retrieving data, and then updating that data.
So let's get started!
<br>
# Software needed
To follow along with the scripts in this blog post, you will need Docker and pgAdmin installed.
Both are completely free and can be downloaded here: -
Docker - https://www.docker.com/products/docker-desktop/ <br>
pgAdmin - https://www.pgadmin.org/
<br>
# Running PostgreSQL
We have created a custom PostgreSQL docker image which has a demo database ready to go.
In order to run the container, open a terminal and execute: -
docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres:latest
This will pull the image down from our GitHub repository and spin up an instance of PostgreSQL with a database, dvdrental, ready to go.
Note - the image size is 437MB, which may or may not be an issue depending on your internet connection.
Confirm the container is up and running with: -
docker container ls
Then open pgAdmin and connect with the server name as *localhost* and the password as *Testing1122*
<br>
# Selecting data
Once you've connected to the instance of PostgreSQL running in the container, let's look at the staff table in the dvdrental database. Right-click on the dvdrental database in the left-hand menu and select Query Tool.
To retrieve data from a table we use a SELECT statement. The structure of a SELECT statement is this: -
SELECT data_we_want_to_retrieve
FROM table
WHERE some_condition
So to retrieve all the data from the staff table, we would run: -
SELECT *
FROM staff
The * indicates we want to retrieve all the columns from the table.
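If we only wanted certain columns, we would list them instead of using * (for example, two of the columns we'll see in the staff table shortly): -

```sql
SELECT first_name, last_name
FROM staff
```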
If we wanted to only retrieve staff members called “Mike” we would run: -
SELECT *
FROM staff
WHERE first_name = 'Mike'
OK, now let's look at joining two tables together in the SELECT statement.
Here is the relationship between the staff and address tables: -
![](images/day64-1.png)
From the Entity Relational Diagram (ERD), which is a method of displaying tables in a relational database, we can see that the tables are joined on the address_id column.
The address_id column is a primary key in the address table and a foreign key in the staff table.
We can also see (by looking at the join) that this is a many-to-one relationship…aka a row in the address table can be linked to more than one row in the staff table.
Makes sense as more than one member of staff could have the same address.
Ok, in order to retrieve data from both the staff and address tables we join them in our SELECT statement: -
SELECT *
FROM staff s
INNER JOIN address a ON s.address_id = a.address_id
That will retrieve all the rows from the staff table and also all the corresponding rows from the address table…aka we have retrieved all staff members and their addresses.
Let's limit the query a little. Let's just retrieve some data from the staff table and some from the address table, for one staff member: -
SELECT s.first_name, s.last_name, a.address, a.district, a.phone
FROM staff s
INNER JOIN address a ON s.address_id = a.address_id
WHERE first_name = 'Mike'
Here we have only retrieved the name of any staff member called Mike and their address.
You may have noticed that when joining the address table to the staff table we used an <b>INNER JOIN</b>.
This is a type of join that specifies only to retrieve rows in the staff table that have a corresponding row in the address table.
The other types of joins are: -
<b>LEFT OUTER JOIN</b> - this would retrieve rows from the staff table even if there was no corresponding row in the address table
<b>RIGHT OUTER JOIN</b> - this would retrieve rows from the address table even if there was no corresponding row in the staff table
<b>FULL OUTER JOIN</b> - this would retrieve all rows from both tables even if there was no corresponding matching row in the other table
If we run: -
SELECT *
FROM staff s
RIGHT OUTER JOIN address a ON s.address_id = a.address_id
We will get all the rows in the address table, including those that do not have a corresponding row in the staff table.
But, if we run: -
SELECT *
FROM staff s
LEFT OUTER JOIN address a ON s.address_id = a.address_id
We will still only get records in the staff table that have a record in the address table…the output is the same as the INNER JOIN.
<br>
# Inserting data
That is because of the FOREIGN KEY constraint linking the two tables. If we tried to insert a row into the staff table and the address_id we specified did not exist in the address table, we would get an error: -
ERROR: insert or update on table "staff" violates foreign key constraint "staff_address_id_fkey"
This is because the foreign key is saying that a record in the staff table must reference a valid row in the address table.
Enforcing this relationship is enforcing the referential integrity of the database…i.e. - maintaining consistent and valid relationships between the data in the tables.
So we have to add a row to the staff table that references an existing row in the address table.
So an example of this would be: -
INSERT INTO staff(
staff_id, first_name, last_name, address_id,
email, store_id, active, username, password, last_update, picture)
VALUES
(999, 'Andrew', 'Pruski', 1, 'andrew.pruski@90daysofdevops.com',
'2', 'T', 'apruski', 'Testing1122', CURRENT_DATE, '');
Notice that we specify all the columns in the table and then the corresponding values.
To verify that the row has been inserted: -
SELECT s.first_name, s.last_name, a.address, a.district, a.phone
FROM staff s
INNER JOIN address a ON s.address_id = a.address_id
WHERE first_name = 'Andrew'
And there is our inserted row!
<br>
# Updating data
To update a row in a table we use a statement in the format: -
UPDATE table
SET column = new_value
WHERE some_condition
OK, now let's update the row that we inserted previously. Say the staff member's email address has changed. To view the current email address: -
SELECT s.first_name, s.last_name, s.email
FROM staff s
WHERE first_name = 'Andrew'
And we want to change that email value to apruski@90daysofdevops.com. To update that value: -
UPDATE staff
SET email = 'apruski@90daysofdevops.com'
WHERE first_name = 'Andrew'
You should see that one row has been updated in the output. To confirm run the SELECT statement again: -
SELECT s.first_name, s.last_name, s.email
FROM staff s
WHERE first_name = 'Andrew'
<br>
# Deleting data
To delete a row from a table we use a statement in the format: -
DELETE FROM table
WHERE some_condition
So to delete the row we inserted and updated previously, we can run: -
DELETE FROM staff
WHERE first_name = 'Andrew'
You should see that one row was deleted in the output. To confirm: -
SELECT s.first_name, s.last_name, s.email
FROM staff s
WHERE first_name = 'Andrew'
No rows should be returned.
<br>
# Creating tables
Let's have a look at the definition of the staff table. This can be scripted out by right-clicking on the table, then Scripts > CREATE Script.
This will open a new query window and show the statement to create the table.
Each column will be listed with the following properties: -
Name - Data type - Constraints
If we look at the address_id column we can see: -
address_id smallint NOT NULL
So we have the column name, that it is a smallint data type (https://www.postgresql.org/docs/9.1/datatype-numeric.html), and that it cannot be null. It cannot be null as a FOREIGN KEY constraint is going to be created to link to the address table.
The columns that are a character datatype also have a COLLATE property. Collations specify case, sorting rules, and accent sensitivity properties. Each character datatype here is using the default setting; more information on collations can be found here: -
https://www.postgresql.org/docs/current/collation.html
Other columns also have default values specified…such as the <b>last_update</b> column: -
last_update timestamp without time zone NOT NULL DEFAULT 'now()'
This says that if no value is set for the column when a row is inserted, the current time will be used.
Then we have a couple of constraints defined on the table. Firstly a primary key: -
CONSTRAINT staff_pkey PRIMARY KEY (staff_id)
A primary key is a unique identifier in a table, aka this column can be used to identify individual rows in the table. The primary key on the address table, address_id, is used as a foreign key in the staff table to link a staff member to an address.
The foreign key is also defined in the CREATE TABLE statement: -
CONSTRAINT staff_address_id_fkey FOREIGN KEY (address_id)
REFERENCES public.address (address_id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE RESTRICT
We can see here that the address_id column in the staff table references the address_id column in the address table.
The ON UPDATE CASCADE means that if the address_id in the address table is updated, any rows in the staff table referencing it will also be updated. (Note - it's very rare that you would update a primary key value in a table, I'm including this here as it's in the CREATE TABLE statement).
The ON DELETE RESTRICT prevents the deletion of any rows in the address table that are referenced in the staff table. This prevents rows in the staff table having references to the rows in the address table that are no longer there…protecting the integrity of the data.
OK, so let's create our own table and import some data into it: -
CREATE TABLE test_table (
id smallint,
first_name VARCHAR(50),
last_name VARCHAR(50),
dob DATE,
email VARCHAR(255),
CONSTRAINT test_table_pkey PRIMARY KEY (id)
)
NOTE - VARCHAR is an alias for CHARACTER VARYING which we saw when we scripted out the staff table
Ok, so we have a test_table with five columns and a primary key (the id column).
Let's go and import some data into it. The Docker image that we are using has a test_data.csv file in the /dvdrental directory and we can import that data with: -
COPY test_table(id,first_name, last_name, dob, email)
FROM '/dvdrental/test_data.csv'
DELIMITER ','
CSV HEADER;
To verify: -
SELECT * FROM test_table
So that's how to retrieve, insert, update, and delete data in a database. We also looked at creating tables and importing data.
Join us tomorrow where we will be looking at backing up and restoring databases.
Thank you for reading!

2023/day65.md
@@ -0,0 +1,253 @@
# Backing up and restoring databases
Hello and welcome to the third post in the database part of the 90 Days of DevOps blog series! Today we'll be talking about backing up and restoring databases.
One of the (if not the) most vital tasks a Database Administrator performs is backing up databases.
Things do go wrong with computer systems, not to mention with the people who operate them 🙂, and when they do, we need to be able to recover the data.
This is where backups come in. There are different types of backups, and different types of databases perform their backups in different ways…but the core concepts are the same.
The core backup is the full backup. This is a backup of the entire database, everything, at a certain point in time. These backups are the starting point of a recovery process.
Then there are incremental/differential backups. Certain databases allow for this type of backup to be taken which only includes the data changes since the last full backup. This type of backup is useful when dealing with large databases and taking a full backup is a long process. In the recovery process, the full backup is restored first and then the applicable incremental/differential backup.
Another important type of backup is backing up the “log”. Databases have a log of transactions that are executed against the data stored. Typically the log is written to before any data changes are made so that in the event of a failure, the log can be “replayed” (aka rolling forward any committed transactions in the log, and rolling back any uncommitted transactions) so that the database comes back online in a consistent state.
Log backups allow database administrators to achieve point-in-time recovery. By restoring the full backup, then any incremental/differential backups, and then subsequent log backups, the DBA can roll the database to a certain point in time (say before a data loss event to recover data).
Backups are kept separate from the server that hosts the databases being backed up. You don't want a server to go down and take its database backups with it! Typically backups will be stored in a centralised location and then shipped to a 3rd site (just in case the whole primary site goes down).
One motto of DBAs is "It's not a backup process, it's a recovery process" (or so Andrew says). Meaning that backing up the databases is useless if those backups cannot be restored easily. So DBAs will have a whole host of scripts ready to go if a database needs to be restored, to make the process as painless as possible. You really don't want to be scrabbling around looking for your backups when you need to perform a restore!
Let's have a look at backing up and restoring a database in PostgreSQL.
<br>
# Setup
For the demos in this blog post we'll be using the dvdrental database in the custom PostgreSQL Docker image. To spin this image up, start Docker, open a terminal, and run: -
docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres:latest
Note - the image size is 497MB which may or may not be an issue depending on your internet connection
<br>
# Taking a full backup
Starting with the simplest of the backup types, the full backup. This is a copy (or dump) of the database into a separate file that can be used to roll the database back to the point that the backup was taken.
Let's run through taking a backup of the *dvdrental* database in the PostgreSQL image.
Connect to the server via pgAdmin (server name is *localhost* and the password is *Testing1122*), right-click on the *dvdrental* database, and select *Backup*.
![](images/day65-1.png)
Enter a directory and filename on your local machine to store the backup file (in this example, I'm using *C:\temp\dvdrental.backup*).
Then hit Backup! Nice and simple!
If we click on the Processes tab in pgAdmin, we can see the completed backup. What's nice about this is that if we click on the file icon, it will give us a dialog box of the exact command executed, and a step-by-step log of the process run.
![](images/day65-2.png)
The process used a program called pg_dump (https://www.postgresql.org/docs/current/backup-dump.html) to execute the backup and store the files in the location specified.
This is essentially an export of the database as it was when the backup was taken.
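For reference, a roughly equivalent call to pg_dump from the command line might look something like this (the path and format flag here are just an example, not exactly what pgAdmin ran): -

```bash
# dump the dvdrental database in pg_dump's custom format to a single file
pg_dump -h localhost -U postgres -F c -f /tmp/dvdrental.backup dvdrental
```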
<br>
# Restoring the full backup
Ok, say that database got accidentally dropped (it happens!)…we can use the backup to restore it.
Open a query against the PostgreSQL database and run: -
DROP DATABASE dvdrental
Note - if that throws an error, run this beforehand: -
SELECT pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = 'dvdrental'
AND pid <> pg_backend_pid();
OK, now we need to get the database back. So create a new database: -
CREATE DATABASE dvdrental_restore
Execute that command and then right-click on the newly created database in the left-hand menu, then select *Restore*
![](images/day65-3.png)
Select the filename of the backup that we performed earlier, then click Restore.
Let that complete, refresh the database list in the left-hand menu…and there we have it! Our database has been restored…all the tables and data are back as they were when we took the backup!
As with the backup, we can see the exact command used to restore the database. If we click on the processes tab in pgAdmin, and then the file icon next to our restore, we will see: -
![](images/day65-4.png)
Here we can see that a program called pg_restore was used to restore the database from the backup file that we created earlier.
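Again for reference, a roughly equivalent pg_restore call from the command line might look something like this (assuming the backup file was taken in pg_dump's custom format, as above): -

```bash
# restore the custom-format backup into the newly created database
pg_restore -h localhost -U postgres -d dvdrental_restore /tmp/dvdrental.backup
```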
<br>
# Point in time restores
Now that we've run through taking a full backup and then restoring that backup…let's have a look at performing a point-in-time restore of a database.
Full backups are great to get the data at a certain point…but we can only get the data back to the point when that full backup was taken. Typically, full backups are run once daily (at most), so if we had a data loss event several hours after that backup was taken…we would lose all data changes made since that backup if we just restored the full backup.
In order to recover the database to a point in time after the full backup was taken we need to restore additional backups to “roll” the database forward.
We can do this in PostgreSQL as PostgreSQL maintains a write-ahead log (WAL) that records every change (transaction) made to the database. The main purpose of this log is that if the server crashes, the database can be brought back to a consistent state by replaying the transactions in the log.
But this also means that we can archive the log (or WAL files) and use them to perform a point-in-time restore of the database.
Let's run through setting up write-ahead logging and then performing a point-in-time restore.
First thing to do is run a container with PostgreSQL installed: -
docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres:latest
Jump into the container: -
docker exec -it -u postgres demo-container bash
In this container image there are two locations that we will use for our backups.
*/postgres/archive/base* for the baseline backup and */postgres/archive/wal* for the log archive.
Now we're going to edit the *postgresql.conf* file to enable WAL archiving: -
vim $PGDATA/postgresql.conf
Drop the following lines into the config file: -
archive_mode = on
archive_command = 'cp %p /postgres/archive/wal/%f'
- <b>archive_mode</b> - enables WAL archiving
- <b>archive_command</b> - the command used to archive the WAL files (%p is replaced by the path name of the file to archive, and any %f is replaced by the file name).
Exit the container and restart to enable WAL archiving: -
docker container restart demo-container
OK, the next thing to do is take a base backup of the database cluster. Here we are using pg_basebackup (https://www.postgresql.org/docs/current/app-pgbasebackup.html) which is different from the command used to take a full backup as it backs up all the files in the database cluster.
Aka it's a file-system backup of all the files on the server, whereas the full backup used pg_dump, which is used to back up only one database.
Jump back into the container: -
docker exec -it -u postgres demo-container bash
And take the backup: -
pg_basebackup -D /postgres/archive/base
We will use the files taken in this backup as the starting point of our point in time restore.
To test our point-in-time restore, connect to the dvdrental database in pgAdmin (server is localhost and password is Testing1122), create a table, and import sample data (the CSV file is in the container image): -
CREATE TABLE test_table (
id smallint,
first_name VARCHAR(50),
last_name VARCHAR(50),
dob DATE,
email VARCHAR(255),
CONSTRAINT test_table_pkey PRIMARY KEY (id)
)
COPY test_table(id,first_name, last_name, dob, email)
FROM '/dvdrental/test_data.csv'
DELIMITER ','
CSV HEADER;
Confirm the data is in the test table: -
SELECT * FROM test_table
![](images/day65-5.png)
What we're going to do now is simulate a data loss event. For example, an incorrect DELETE statement executed against a table that removes all the data.
So wait a few minutes and run (make a note of the time): -
DELETE FROM test_table
Ok, the data is gone! To confirm: -
SELECT * FROM test_table
![](images/day65-6.png)
We need to get this data back! So, jump back into the container: -
docker exec -it -u postgres demo-container bash
The first thing to do in the recovery process is create a recovery file in the location of our base backup: -
touch /postgres/archive/base/recovery.signal
This file will automatically get deleted when we perform our point in time restore.
Now we need to edit the *postgresql.conf* file to tell PostgreSQL to perform the recovery: -
vim $PGDATA/postgresql.conf
Add in the following to the top of the file (you can leave the WAL archiving options there): -
restore_command = 'cp /postgres/archive/wal/%f %p'
recovery_target_time = '2023-02-13 10:20:00'
recovery_target_inclusive = false
data_directory = '/postgres/archive/base'
- <b>restore_command</b> - this is the command to retrieve the archived WAL files
- <b>recovery_target_time</b> - this is the time that we are recovering to (change for just before you executed the DELETE statement against the table)
- <b>recovery_target_inclusive</b> - this specifies whether to stop just after (true) or just before (false) the recovery target; setting it to false stops the recovery just before the specified recovery time
- <b>data_directory</b> - this is where we point PostgreSQL to the files taken in the base backup
Now we need to tell PostgreSQL to switch to a new WAL file, allowing for the old one to be archived (and used in recovery): -
psql -c "select pg_switch_wal();"
Almost there! Jump back out of the container and then restart: -
docker container restart demo-container
If we check the logs of the container, we can see the recovery process: -
docker container logs demo-container
![](images/day65-7.png)
We can see at the end of the logs that we have one more thing to do, so, one more time, jump back into the container: -
docker exec -it -u postgres demo-container bash
And resume replay of the WAL files: -
psql -c "select pg_wal_replay_resume();"
OK, finally to confirm, in pgAdmin, connect to the dvdrental database and run: -
SELECT * FROM test_table
![](images/day65-8.png)
The data is back! We have successfully performed a point in time restore of our database!
Join us tomorrow where we will be talking about high availability and disaster recovery.
Thanks for reading!

2023/day66.md
@@ -0,0 +1,209 @@
# High availability and disaster recovery
Hello and welcome to the fourth post in the database part of the 90 Days of DevOps blog series! Today we'll be talking about high availability and disaster recovery.
One of the main jobs of a database administrator is to configure and maintain disaster recovery and high availability strategies for the databases that they look after. In a nutshell, they boil down to: -
<b>Disaster recovery (DR)</b> - recovering databases in the event of a site outage.<br>
<b>High availability (HA)</b> - ensuring databases stay online in the event of a server outage.
Lets go through both in more detail.
<br>
# Disaster recovery
Database administrators are a paranoid bunch (Andrew nodding his head). It's their job to think about how the database servers will fail and how best to recover from that failure.
Two main factors come into play when thinking about disaster recovery…
<b>RTO - Recovery Time Objective</b> - How long can the databases be offline after a failure?<br>
<b>RPO - Recovery Point Objective</b> - How much data can be lost in the event of a failure?
Basically, RTO is how quickly we need to get the databases back online after a failure, and RPO is how much data we can afford to lose in the event of a failure.
In the last post we talked about backing up and restoring databases…and backups could be good enough for a disaster recovery strategy. Now, there's a load of caveats with that statement!
In the event of a site outage…can we easily and quickly restore all of the databases within the RTO and to the RPO? More often than not, for anyone looking after anything more than a couple of small databases, the answer is no.
So an alternate strategy would need to be put into place.
A common strategy is to have what is known as a “warm standby”. Another server is spun up in a different site to the primary server (could potentially be another private data centre or the cloud) and a method of pushing data changes to that server is put into place.
There are a couple of methods of doing this…one is referred to as "log shipping". A full backup of the database is restored to the disaster recovery server and then the logs of the database are "shipped" from the primary server and restored to the secondary.
In the event of an outage on the primary site, the secondary databases are brought online and the applications pointed to that server.
This means that the databases can be brought online in a relatively short period of time but caution needs to be taken as there can be data loss with this method. It depends on how often the logs are being restored to the secondary…which is where the RPO comes into play.
The Database Administrator needs to ensure that the logs are shipped frequently enough to the secondary so that in the event of a primary site outage, the amount of data loss falls within the RPO.
Another method of keeping a warm standby is asynchronous replication or mirroring. In this method, the full backup of the database is restored as before but then transactions are sent to the secondary when they are executed against the primary server.
Again, data loss can occur with this method as there is no guarantee that the secondary is "up-to-date" with the primary server. The transactions are sent to the secondary and committed on the primary…with no waiting for the secondary to acknowledge that the transaction has been committed there. This means that the secondary can lag behind the primary…the amount of lag would be determined by the network connectivity between the primary and secondary sites, the number of transactions hitting the primary, and the amount of data being altered.
<br>
# High availability
Disaster recovery strategies really do mean recovering from a “disaster”, typically the entire primary site going down.
But what if just one server goes down? We wouldn't want to enact our DR strategy just for that one server…this is where high availability comes in.
High availability means that if a primary server goes down, a secondary server will take over (pretty much) instantly…with no data loss.
In this setup, a primary server and one or more secondary servers are set up in a group (or cluster). If the primary server has an issue…one of the secondaries will automatically take over.
There are various different methods of setting this up…PostgreSQL and MySQL have synchronous replication and this is the method we will focus on here.
Synchronous replication means that when a transaction is executed against the primary server, it is also sent to the secondaries, and the primary waits for acknowledgement from the secondaries that they have committed the transaction before committing it itself.
Now, this means that the network between the primary and secondaries has to be able to handle the volume of transactions and data being sent between all the servers in the cluster, because if the secondaries take a long time to receive, commit, and acknowledge transactions, the transactions on the primary will take longer as well.
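For reference, whether a standby is treated as synchronous in PostgreSQL is controlled by a couple of settings on the primary, roughly along these lines (the standby name here is hypothetical): -

```
# postgresql.conf on the primary
synchronous_standby_names = 'standby1'  # application_name of the standby that must acknowledge commits
synchronous_commit = on                 # wait for that standby to confirm before a commit returns
```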
Let's have a look at setting up replication between two instances of PostgreSQL.
<br>
# Setting up replication for PostgreSQL
What we're going to do here is spin up two containers running PostgreSQL and then get replication set up from the "primary" to the "secondary".
One thing we're not going to do here is configure the servers for automatic failover, i.e. - the secondary server taking over if there is an issue with the primary.
As noted in the PostgreSQL documentation (https://www.postgresql.org/docs/current/warm-standby-failover.html), PostgreSQL does not natively implement a system to provide automatic failover; external tools such as PAF (http://clusterlabs.github.io/PAF/) have to be used…so we'll skip that here and just get replication working.
The first thing to do is create a custom Docker bridge network: -
docker network create postgres
This will allow our two containers to communicate using their names (instead of IP addresses).
Now we can run our first container on the custom network which is going to be our “primary” instance: -
docker run -d \
--publish 5432:5432 \
--network=postgres \
--volume C:\temp\base:/postgres/archive/base \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres
This container run statement is a little different than the ones used in the previous blog posts.
We've included: -
<b>--network=postgres</b> - this is the custom docker network that weve created<br>
<b>--volume C:\temp\base:/postgres/archive/base</b> - mounting a directory on our local machine to /postgres/archive/base in the container. This is where we will store the base backup for setting up the secondary. Change the location based on your local machine; I'm using C:\temp\base in this example.
Now exec into the container: -
docker exec -it -u postgres demo-container bash
We need to update the pg_hba.conf file to allow connections to our secondary instance: -
vim $PGDATA/pg_hba.conf
Add in the following lines to the top of the file: -
# TYPE DATABASE USER ADDRESS METHOD
host replication replicator 172.18.0.1/24 trust
<b>172.18.0.1/24</b> is the address range of containers on the custom network. If you have other custom docker networks this will change (confirm the address of the primary container with docker container inspect demo-container)
OK, connect to the primary container in pgAdmin (server is *localhost* and password is *Testing1122*) and create a user for replication: -
CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'Testing1122';
Then create a slot for replication: -
SELECT * FROM pg_create_physical_replication_slot('replication_slot_slave1');
N.B. - Replication slots provide an automated way to ensure that a primary server does not remove WAL files until they have been received by the secondaries. Aka they ensure that the WAL files the secondaries still need are retained until they can be applied.
Confirm that the slot has been created: -
SELECT * FROM pg_replication_slots;
![](images/day66-1.png)
Now back in the container, we take a base backup: -
pg_basebackup -D /postgres/archive/base -S replication_slot_slave1 -X stream -U replicator -Fp -R
Alright, what's happening here?
<b>-D /postgres/archive/base</b> - specify the location for the backup<br>
<b>-S replication_slot_slave1</b> - specify the replication slot we created (N.B. - this uses out-of-date terminology which will hopefully be changed in the future)<br>
<b>-X stream</b> - Include WAL files in backup (stream whilst the backup is being taken)<br>
<b>-U replicator</b> - specify user<br>
<b>-Fp</b> - specify format of the output (plain)<br>
<b>-R</b> - creates the standby.signal file in the location of the directory (for setting up the standby server using the results of the backup)<br>
More information about these parameters can be found here: -
https://www.postgresql.org/docs/current/app-pgbasebackup.html
Now we are ready to create our secondary container.
docker run -d \
--publish 5433:5432 \
--network=postgres \
--volume C:\temp\base:/var/lib/postgresql/data \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container2 \
ghcr.io/dbafromthecold/demo-postgres
Again, this container run statement is a little different than before. We're on the custom network (as with the first container), but we also have: -
<b>--publish 5433:5432</b> - changing the port that we connect to the instance on, as we already have our primary container on port 5432.<br>
<b>--volume C:\temp\base:/var/lib/postgresql/data</b> - this is saying to use the directory where we stored our base backup as the data location for the postgres instance in the secondary. We're doing this so we don't have to copy the base backup into the secondary container and change the default data directory.
Once the secondary is running, jump into it: -
docker exec -it -u postgres demo-container2 bash
And open the postgresql.auto.conf file: -
vim $PGDATA/postgresql.auto.conf
Here we are going to add in information about the primary container. Replace the *primary_conninfo* line with: -
primary_conninfo = 'host=demo-container port=5432 user=replicator password=Testing1122'
Exit out of the container and restart both the primary and secondary: -
docker container restart demo-container demo-container2
We're now ready to test replication from the primary container to the secondary! Connect to the *dvdrental* database in the primary container in pgAdmin (server is *localhost* and password is *Testing1122*).
Create a test table and import some data: -
CREATE TABLE test_table (
id smallint,
first_name VARCHAR(50),
last_name VARCHAR(50),
dob DATE,
email VARCHAR(255),
CONSTRAINT test_table_pkey PRIMARY KEY (id)
)
COPY test_table(id,first_name, last_name, dob, email)
FROM '/dvdrental/test_data.csv'
DELIMITER ','
CSV HEADER;
Then connect to the dvdrental database in pgAdmin in the secondary container (server name and password are the same as the primary container but change the port to *5433*).
Run the following to check the data: -
SELECT * FROM test_table
![](images/day66-2.png)
And there's the data. We have successfully configured replication between two instances of PostgreSQL!
You can further test this by deleting the data on the primary and querying the data on the secondary.
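You can also check the state of replication from the primary itself via the pg_stat_replication view (run this against the primary, on port 5432): -

```sql
-- one row per connected standby; state should be 'streaming'
SELECT application_name, client_addr, state, sync_state
FROM pg_stat_replication;
```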
Join us tomorrow where we'll be talking about performance tuning.
Thanks for reading!

2023/day67.md
@@ -0,0 +1,138 @@
# Performance tuning
Hello and welcome to the fifth post in the database part of the 90 Days of DevOps blog series! Today we'll be talking about performance tuning.
Performance tuning is a massive area in the database field. There are literally thousands of books, blog posts, videos, and conference talks on the subject. People have made (and are still making) careers out of performance tuning databases.
We're not going to cover everything here (it's pretty much impossible to do in one blog post), so what we'll do is talk about the main areas of focus when it comes to ensuring that the database systems we are looking after hit the performance targets required.
<br>
# Server performance tuning
Andrew always tells people to know their environment completely when it comes to approaching performance tuning.
This means we need to start off by looking at the hardware our database is running on.
This used to be "relatively" simple. We had physical servers attached to storage where we would install an OS, and then install our database engine. Here we're concerned with the specifications of the server: CPU, memory, and storage.
<b>CPU</b> - does the server have enough compute power to handle the number of transactions the database engine will be executing?
<b>Memory</b> - database systems cache data in memory to perform operations (certain ones work entirely in memory - Redis for example). Does the server have enough memory to handle the amount of data that it'll be working with?
<b>Storage</b> - is the storage available to the server fast enough so that when data is requested from disk it can serve up that data with minimal latency?
Nowadays the most common setup for database servers is running on a virtual machine. A physical machine is carved up into virtual machines in order to make resource usage more efficient, improve manageability, and reduce cost.
But this means that we have another layer to consider when looking at getting the maximum performance out of our servers.
Not only do we have the same areas to look at with physical machines (CPU, Memory, Storage) but we also now have to consider the host that the virtual machine is running on.
Does that have enough resources to handle the traffic that the virtual machine will be running? What other virtual machines are on the host that our database server is on? Is the host oversubscribed (i.e. - the virtual machines on the host have more resources assigned to them than the actual physical host has)?
Running database engines in containers is now becoming more popular (as we have been doing in the demos for this series). However, just running a database server in one container can lead to issues (see this series' post on high availability). For production workloads, a container orchestrator is used. There are a few options out there, but the main one that has come to the fore is Kubernetes.
So this means that we have a whole bunch of other considerations when thinking about performance for our database engine.
What spec are the hosts in the Kubernetes cluster? Will they have enough resources to handle the traffic of our database engine? What else is running on that cluster? Have we got the correct settings in our deployment manifest for our database engine?
Knowing your environment completely is the first step in building a database server that will perform to the required standard. Once we know we have built a server that can handle the transactions hitting our databases, we can move on to the next step, tuning the database engine itself.
<br>
# Database engine performance tuning
Database systems come with a huge variety of different settings that can be used to tune them. Now, there are settings that Database Administrators will 100% say need to be altered for all workloads, no matter what they are, but there are others that depend on the workload itself.
For example, the memory available to the database engine may not be enough for the workload that the system will be dealing with, in which case it will need to be increased. Or conversely, it may not be limited at all…which would allow the database engine to consume all the memory on the server, starving the OS and leading to issues…so that would need to be limited.
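As an illustration, in PostgreSQL the memory used for caching data and for sorts is controlled by parameters such as shared_buffers and work_mem; adjusting them might look something like this (the values here are purely illustrative): -

```sql
SHOW shared_buffers;                          -- check the current values first
SHOW work_mem;

ALTER SYSTEM SET shared_buffers = '2GB';      -- written to postgresql.auto.conf; needs a server restart
ALTER SYSTEM SET work_mem = '64MB';           -- used by new sessions after a reload

SELECT pg_reload_conf();                      -- reload the configuration (shared_buffers still requires a restart)
```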
Getting the right configuration settings can be a daunting process…especially for anyone new to the particular database system being tuned. This is where development environments come into play…having a development environment that is (somewhat) similar to a production environment allows Database Administrators to make configuration changes and monitor the results.
Ok, the challenge here is that typically, development environments do not get the throughput that production environments do, and the databases are smaller as well.
To get around this, there are a host of tools out there that can simulate workload activity. A DBA would run a tool to get a baseline of the performance of the system, make some configuration changes, then run the tool again to see what (if any 🙂 ) the increase in performance is.
Once the database engine is configured, we can then move on to the next area of performance tuning, query performance.
<br>
# Query performance tuning
Even with a powerful server and properly configured database engine, query performance can still be poor.
Thankfully there are a host of tools that can capture queries hitting databases and report on their performance. If a particular query starts suffering, a DBA needs to go and analyse what has gone wrong.
When a query hits a database, an execution plan for that query is generated…an execution plan is how the data will be retrieved from the database.
The plan is generated from statistics stored in the database which could potentially be out of date. If they are out of date then the plan generated will result in the query being inefficient.
For example…if a large dataset has been inserted into a table, the statistics for that table may not have been updated, resulting in any queries to that table then having an inefficient plan generated as the database engine does not know about the new data that has been inserted.
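In PostgreSQL, for example, the statistics for a table can be refreshed manually and the plan a query actually used can be inspected (using the actor table from the demo database we'll query below): -

```sql
ANALYZE actor;            -- refresh the planner statistics for the table

EXPLAIN ANALYZE           -- show the chosen plan, with actual row counts and timings
SELECT *
FROM actor
WHERE last_name = 'Cage';
```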
Another key factor when it comes to query performance is indexing. If a query hits a large table and does not have a supporting index, it will scan through every row in the table until it finds the row required…not an efficient way to retrieve data.
Indexes solve this problem by pointing the query to the correct row in the table. They are often described as the index of a book. Instead of a reader going through each page in a book to find what they need, they simply go to the index, find the entry in the index that points them to the page in the book they are looking for, and then go straight to that page.
So the key questions a Database Administrator will ask when troubleshooting query performance are…are the statistics up to date? Are there any supporting indexes for this query?
It can be tempting to add indexes to a table to cover all queries hitting it; however, when data in a table is updated, any indexes on that table need to be updated as well, so there is a performance hit on INSERT/UPDATE/DELETE queries. It's all about finding the correct balance.
Let's have a look at indexes in the *dvdrental* database.
Run a container: -
docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres:latest
Connect to the dvdrental database in pgAdmin (server is *localhost* and password is *Testing1122*). Open a new query window and run the following: -
SELECT * FROM actor WHERE last_name = 'Cage'
We can see that 200 rows are returned. If we want to see the execution plan being used we can hit the Explain button: -
![](images/day67-1.png)
Here we can see that the plan is really simple. We have just one operation, a scan on the actor table.
However, if we look at the actor table in the left-hand menu, there is an index on the last_name column!
![](images/day67-2.png)
So why didn't the query use that index?
This is due to the size of the table…it only has 200 rows, so the database engine decided that a full table scan would be more efficient than doing an index lookup. This is just one of the very many nuances of query performance!
Let's force PostgreSQL to use that index. Run the following: -
SET enable_seqscan=false
NOTE - this setting is just for developing queries to see if a particular query would use an index on a large dataset. Don't go doing this in a production environment!
Then highlight the SELECT statement and hit the Explain button: -
![](images/day67-3.png)
And there we can see that the query is now using the index and then going back to the table! So if the dataset here was larger, we know that we have an index to support that query.
Ok, but what about querying on first_name in the actor table: -
SELECT * FROM actor WHERE first_name = 'Nick'
![](images/day67-4.png)
Here we can see that we're back to the table scan. There's no supporting index on the first_name column!
Let's create one: -
CREATE INDEX idx_actor_first_name ON public.actor (first_name)
Now explain the SELECT statement on first_name again: -
![](images/day67-5.png)
And there we have it! Our query now has a supporting index!
Join us tomorrow where we'll be talking about database security.
Thanks for reading!

2023/day68.md
@@ -0,0 +1,207 @@
# Database security
Hello and welcome to the sixth post in the database part of the 90 Days of DevOps blog series! Today we'll be talking about database security.
Controlling access to data is an incredibly important part of being a Database Administrator. DBAs need to prevent unauthorised access to the data that they are responsible for. In order to do this, all the different levels of access to a database (starting with the server, then the database engine, and then the data itself) need to be controlled.
Let's go through each layer of security.
<br>
# Server security
The first area to look at when it comes to database security is who has access to the server that the database instance is running on.
Typically the System Administrators will have admin access to the server along with the Database Administrators. Now, DBAs won't like this…but do they really need admin access to the servers? It's a toss-up between who will support any server issues…if that's solely down to the sysadmins, then the DBAs do not need admin access to the servers (Andrew's eye is twitching 🙂).
The next thing to consider is the account that the database service is running under. Andrew has seen multiple installs of SQL Server where the account the database engine runs under is a local admin on the server and worse…the same account is used on multiple servers…all with admin access!
DBAs do this so that they don't run into any security issues when the database engine tries to access resources on the server, but it's not secure.
Database services should not run under an admin account. The account they run under should only have access to the resources that it needs. Permissions should be explicitly granted to that account and monitored.
The reason for this is that if the account becomes compromised, it does not have full access to the server. Imagine if an admin account that was used for multiple servers was compromised, that would mean all the servers that used that account would be vulnerable!
Individual accounts for each database service should be used with only the permissions required granted.
That way if one is compromised, only that server is affected and we can be (fairly) confident that the other servers in our environment are not at risk.
<br>
# Database security
The next level of security is who can access the databases on the server? Each database engine will have another level of security on top of who can access the server itself.
Certain database engines will allow local administrators of the server to have full access to the databases; this needs to be disabled. The reason for this is that if the server becomes compromised, then access to the databases isn't automatically granted.
Database Administrators need to work out who needs access to the databases on the server and what level of access they should be given. For instance, the System Administrators have full access to the server, but do they need full access to the databases? More often than not, no, they don't.
Not only do DBAs need to work out who needs access but also what needs access. There will be application accounts that need to retrieve data from the databases. There could also be reporting and monitoring tools that need access.
Application accounts should only have access to the databases they require and reporting/monitoring tools may need access to all the databases on the server but only require read-only access. Furthermore, applications and tools may only need access to certain tables within each database so DBAs would need to restrict access even further.
Again the reason for this is that if an account becomes compromised, the damage is limited.
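To make that concrete, in PostgreSQL this kind of scoping is done with GRANT statements; a rough sketch (the account names here are hypothetical) might look like this: -

```sql
-- an application account that can only read and write the tables it actually uses
CREATE USER app_account WITH PASSWORD 'Testing1122';
GRANT SELECT, INSERT, UPDATE ON staff, address TO app_account;

-- a reporting account that can only ever read
CREATE USER report_account WITH PASSWORD 'Testing1122';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO report_account;
```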
Lets have a look at creating a custom user in PostgreSQL and assigning access rights.
<br>
# Creating a custom user in PostgreSQL
PostgreSQL uses the concept of roles to manage database access. A role can be either a user or a group of users and can have permissions assigned to it. Roles can then be granted membership of other roles.
The concept of roles replaces users and groups in PostgreSQL, but for simplicity, what we're going to do here is create a new user and grant it membership of a pre-defined role.
Spin up a container running PostgreSQL: -
docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres
Connect to PostgreSQL in pgAdmin (server is *localhost* and password is *Testing1122*). Open a query window and run the following to view the existing users: -
SELECT usename FROM pg_user;
![](images/day68-1.png)
As we can see, there is only one user at the moment. This is the default postgres user that has admin rights. We don't want anyone else using this account so let's set up a new one.
To create a new custom user: -
CREATE USER test_user WITH PASSWORD 'Testing1122';
OK, confirm that the new user is there: -
SELECT usename FROM pg_user;
![](images/day68-2.png)
Great! Our user is there; now we need to assign some permissions to it. There are default roles within PostgreSQL that can be used to assign permissions. To view those roles: -
SELECT groname FROM pg_group;
![](images/day68-3.png)
For more information on these roles: -
https://www.postgresql.org/docs/current/predefined-roles.html
Now, these are default roles. They may be OK for our user but they might also grant more permissions than needed. For a production instance of PostgreSQL, custom roles can (should) be created that only grant the exact permissions needed for an account.
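A minimal sketch of what that custom-role approach might look like (the role name is hypothetical and the grants would depend entirely on what the account actually needs): -
-- a read-only role that can be granted to any number of users
CREATE ROLE app_readonly NOLOGIN;
GRANT USAGE ON SCHEMA public TO app_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_readonly;
-- membership of the role gives the user its permissions
GRANT app_readonly TO test_user;
For this demo though, we'll use one of the defaults.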
Grant read to the custom user: -
GRANT pg_read_all_data TO test_user;
Log into the container in pgAdmin with the custom user's credentials, connect to the dvdrental database, and open the query tool.
Try running a SELECT statement against the actor table in the database: -
SELECT * FROM actor
![](images/day68-4.png)
The data is returned as the user has access to read from any table in the database. Now try to update the data: -
UPDATE actor SET last_name = 'TEST'
![](images/day68-5.png)
An error is returned as the user does not have write access to any table in the database.
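That's exactly what we want for this account. If the account genuinely did need to modify data, the grant could still be scoped as tightly as possible, for example (purely illustrative, we're not running this here): -
-- allow updates on one table only
GRANT UPDATE ON actor TO test_user;
-- or even down to a single column
GRANT UPDATE (last_name) ON actor TO test_user;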
Database Administrators must always ensure that users/applications only have the access that they need to the databases and within PostgreSQL, roles are how that is achieved.
<br>
# Data encryption
The next level of security that needs to be considered is data encryption. There are different levels of encryption that can be applied to data. The first option is to encrypt the entire database.
If someone managed to gain access to the server, they could copy the database files off-site, and then try to gain access to the data itself.
By encrypting part or all of the database, an attacker without the relevant encryption keys would not be able (or would be very unlikely) to gain access to the data.
If not all the data in a database is sensitive, then only certain columns within a database can be encrypted. For example, when storing login details for users in a database, the password for those users should (at a minimum) be encrypted.
Then we also need to consider how the data is being accessed. Any application accessing sensitive data should be using a secure connection. There's no point in having the data encrypted in the database and then having it sent across the network unencrypted!
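As a quick sanity check from a query window, PostgreSQL can tell us whether SSL is switched on for the server and whether the current connection is actually using it (note - the demo container used in these posts may well have ssl switched off): -
-- is SSL enabled on the server?
SHOW ssl;
-- is this connection using it?
SELECT pid, ssl, version, cipher
FROM pg_stat_ssl
WHERE pid = pg_backend_pid();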
Another area where encryption needs to be considered is backups. An attacker would not have to target the database server to gain access to the data; they could attempt to gain access to where the database backups are stored. If they gain that access, all they have to do is copy the backups off-site and restore them.
Andrew would always, 100%, advise that database backups are encrypted. When it comes to encryption of the online databases…there can be a performance penalty to pay…so it really comes down to how sensitive the data is.
Let's have a look at encrypting data within PostgreSQL.
<br>
# Encrypting a column in PostgreSQL
What we're going to do here is create a table that has a column that will contain sensitive data. We'll import some data as we have done in the previous posts and then encrypt the sensitive column.
Run a container from the custom image: -
docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres:latest
Connect to the dvdrental database in pgAdmin (server is *localhost* and password is *Testing1122*).
Install the pgcrypto extension: -
CREATE EXTENSION pgcrypto;
For more information on the pgcrypto extension: -
https://www.postgresql.org/docs/current/pgcrypto.html
Now create a test table: -
CREATE TABLE test_table (
id smallint,
first_name VARCHAR(50),
last_name VARCHAR(50),
dob DATE,
email VARCHAR(255),
passwd VARCHAR(255),
CONSTRAINT test_table_pkey PRIMARY KEY (id)
)
And import the sample data (included in the container image): -
COPY test_table(id,first_name, last_name, dob, email)
FROM '/dvdrental/test_data.csv'
DELIMITER ','
CSV HEADER;
Now we're going to use pgp_sym_encrypt to add an encrypted password to the table for both entries: -
UPDATE test_table
SET passwd = (pgp_sym_encrypt('Testing1122', 'ENCRYPTIONPWD'))
WHERE first_name = 'Andrew';
UPDATE test_table
SET passwd = (pgp_sym_encrypt('Testing3344', 'ENCRYPTIONPWD'))
WHERE first_name = 'Taylor';
Note - here we are using a password ('ENCRYPTIONPWD') to encrypt the data. There are many more options for encrypting data within PostgreSQL. See here for more information: -
https://www.postgresql.org/docs/current/encryption-options.html
Now if we try to SELECT as usual from the table: -
SELECT first_name, last_name, passwd FROM test_table
We can only see the encrypted values: -
![](images/day68-6.png)
In order to view the encrypted data, we have to use pgp_sym_decrypt and the key that we set earlier: -
SELECT first_name, last_name, pgp_sym_decrypt(passwd::bytea, 'ENCRYPTIONPWD') FROM test_table
![](images/day68-7.png)
So if we have sensitive data within our database, this is one method of encrypting it so that it can only be accessed with the correct password.
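One nice property of this approach is that supplying the wrong password does not quietly return garbage; the decrypt should fail with a wrong-key style error. For example: -
-- this should error rather than return the plaintext
SELECT first_name, pgp_sym_decrypt(passwd::bytea, 'WrongPassword') FROM test_table;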
Join us tomorrow for the final post in the database series of 90DaysOfDevOps where we'll be talking about monitoring and troubleshooting.
Thanks for reading!

# Monitoring and troubleshooting database issues
Hello and welcome to the seventh and final post in the database part of the 90 Days of DevOps blog series! Today we'll be talking about monitoring and troubleshooting database issues.
Things can, and do, go wrong when looking after database servers and when they do it's the job of the Database Administrator to firstly, get the databases back online, and THEN investigate the cause of the issue.
The number one priority is to get the databases back online and in a healthy state.
Once that has been achieved then the root cause of the issue can be investigated and once uncovered, fixes can be recommended.
There are many reasons that a server can run into issues…too many to list here! But they mainly fall into the following categories…issues with the hardware, the underlying OS, the database engine, and transactions (queries) hitting the databases.
It may not just be one factor causing issues on the server; there may be many! The "death by 1000 cuts" scenario could be the cause: multiple small issues (typically frequently run queries that are experiencing performance degradation) which overall result in the server going down or becoming so overloaded that everything grinds to a halt.
<br>
# Monitoring
The first step in troubleshooting database issues comes not when an actual issue has happened but when the database servers are operating normally.
We don't want to be constantly reacting to (aka firefighting) issues…we want to be proactive and anticipate issues before they happen.
DBAs are trained to think about how servers will fail and how the systems that they look after will react when they encounter failures. Thinking about how the systems will fail plays a huge role when designing high availability and disaster recovery strategies.
However, once in place, HA strategies are not infallible. There could be a misconfiguration that prevents the solution in place from reacting in the expected way. This means that HA (and DR!) strategies need to be regularly tested to ensure that they work as expected when they are needed!
We don't want to be troubleshooting a failed HA solution at the same time as having to deal with the issue that caused the HA solution to kick in in the first place! (Andrew - we really don't, as these things never happen at 4pm on a Tuesday…for some reason it always seems to be 2am on a weekend! 🙂 )
In order to effectively anticipate issues before they happen we need to be monitoring the servers that we look after and have some alerting in place.
There are 100s (if not 1000s) of tools out there that we can use to monitor the servers we manage. There are paid options that come with support and require little configuration to set up and there are others that are free but will require more configuration and have no support.
It's up to us to decide which monitoring tool we go for based on budget, availability (do we have time to spend on configuration?), and skillset (do we have the skills to config and maintain the tool?).
Once we have the tool(s) in place we then point them at our servers and make sure that we are monitoring the likes of the following (a couple of which can be pulled straight from PostgreSQL, as sketched after the list): -
- CPU Usage
- Memory Usage
- Disk Usage
- Network throughput
- Transactions per second
- Database size
Note - This is not a definitive list of things to monitor, there are a tonne of other metrics to collect based on what system(s) are running on the server.
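To give a flavour of where a couple of those numbers come from in PostgreSQL, transaction counts and database size can be pulled from the pg_stat_database view and the built-in size functions; a monitoring tool would typically just poll something like this on a schedule: -
-- committed/rolled back transactions and on-disk size for the dvdrental database
SELECT datname,
       xact_commit,
       xact_rollback,
       pg_size_pretty(pg_database_size(datname)) AS db_size
FROM pg_stat_database
WHERE datname = 'dvdrental';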
Having monitoring in place means that we can see what the "normal" state of the servers is and, if anything changes, we can pin down the exact time the change took place, which is invaluable when investigating.
Certain tools can be hooked into other systems. For example, in a previous role, Andrew's monitoring tool was hooked into the deployment system, so when something was deployed to the database servers, there was a notification on the server's monitoring page.
So, if say the CPU on a particular server skyrocketed up to 100% usage, the DBAs could not only see when this occurred but also what was deployed to that server around that time. Incredibly helpful when troubleshooting.
One thing to consider when setting up monitoring, is when do we want to be alerted? Do we want to be alerted after an issue has occurred or do we want to be alerted before?
For example, something is consuming more disk space than normal on a server and the disk is close to becoming completely full. Do we want to be alerted when the disk is full or when the disk is close to becoming full?
Now that seems like an obvious question but it's slightly more tricky to get right than you would think. If set up incorrectly the monitoring tool will start outputting alerts like crazy and lead to what is known as "alert fatigue".
Alert fatigue is when the DBAs are sent so many alerts that they start to ignore them. The monitoring tool is incorrectly configured and is sending out alerts that do not require immediate action so the DBAs take no steps to clear them.
This can lead to alerts that do require immediate action being ignored, which can then lead to servers going down.
To prevent this, only alerts that require immediate action should be sent to the DBAs. Different alert levels can be set in most tools but again we need to be careful: no-one wants to look at a system with 1000s of "warnings".
So a good monitoring tool is 100% necessary to prevent DBAs from spending their lives firefighting issues on the servers that they maintain.
<br>
# Log collection
Of course with all the best monitoring tools in the world, and proactive DBAs working to prevent issues, things can still go wrong and when they do the root cause needs to be investigated.
This generally means trawling through various logs to uncover the root cause.
Every database system will have an error log that can be used to investigate issues…this along with the logs of the underlying operating system are the first places to look when troubleshooting an issue (that's been resolved! :-) ).
However, having to go onto a server to retrieve the logs is not the best way to investigate issues. Sure, if only one server has issues then it's not so bad but what if more than one server was affected?
Do we really want to be remoting to each individual server to look at their logs?
What we need is a central location where logs can be collected from all the servers so that they can be aggregated and analysed.
This could be the same tool as our monitoring tool but it could be separate. What's important is that we have somewhere that we collect the logs from our servers and they are then presented in an easily searchable format.
We can also place alerts on the logs so that if a known error occurs we can immediately investigate.
Let's have a look at the PostgreSQL logs. Spin up a container: -
docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres
We need to update the postgresql.conf file to write out to a log, so jump into the container: -
docker exec -it -u postgres demo-container bash
Open the file: -
vim $PGDATA/postgresql.conf
And add the following lines: -
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
Exit the container and restart: -
docker restart demo-container
Connect in pgAdmin (server is *localhost* and password is *Testing1122*). Open a new query window and run the following: -
SELECT 1/0
This will generate an error: -
![](images/day69-1.png)
OK so we have a query hitting our database that is failing. We're asked to investigate, so the first place to start would be the logs.
Jump back into the container: -
docker exec -it -u postgres demo-container bash
And navigate to the log file: -
cd $PGDATA/log
Then view the file (the filename will reflect the date and time the logging collector started, so yours will differ): -
cat postgresql-2023-02-24_110854.log
![](images/day69-2.png)
And there's our error! We've configured our instance of PostgreSQL to log errors to a file, which could then be collected and stored in a central location, so if we had this issue in a production environment we would not need to go onto the server to investigate.
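As an aside, on PostgreSQL 10 and later the location of the current log file (and the contents of the log directory) can also be retrieved from a query window, which saves a trip onto the server: -
-- path of the log file currently being written to
SELECT pg_current_logfile();
-- name, size, and modification time of every file in the log directory
SELECT * FROM pg_ls_logdir();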
So we've looked at monitoring our servers and collecting logs of errors, but what about query performance?
<br>
# Query performance
By far the most common issue that DBAs are asked to investigate is poorly performing queries.
Now as mentioned in part 5 of this series, query performance tuning is a massive part of working with databases. People have made (and do make) whole careers out of this area! We're not going to cover everything in one blog post so we will just briefly highlight the main areas to look at.
A proactive approach is needed here in order to prevent query performance degrading. The main areas to look at in order to maintain query performance are query structure, indexes, and statistics.
Is the query structured in a way that optimally retrieves data from the tables in the database? If not, how can it be rewritten to improve performance? (This is a massive area btw, one that takes considerable knowledge and skill.)
When it comes to indexing, do the queries hitting the database have indexes to support them? If indexes are there, are they the most optimal? Have they become bloated (for example, containing empty pages due to data being deleted)?
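A rough sketch of how that investigation might start in PostgreSQL: run the query under EXPLAIN to see whether it is scanning the whole table, and create a supporting index if one is missing (the dvdrental sample database may already ship with an index like this, hence the IF NOT EXISTS): -
-- show the actual execution plan for the query
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM actor WHERE last_name = 'GUINESS';
-- add an index on the column used in the WHERE clause if it does not already exist
CREATE INDEX IF NOT EXISTS idx_actor_last_name ON actor (last_name);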
In order to prevent issues with indexes, a maintenance schedule should be implemented (rebuilding on a regular basis for example).
It's the same with statistics. Statistics in databases can be automatically updated but sometimes (say after a large data insert) they can become out of date, resulting in bad plans being generated for queries. In this case DBAs could implement a maintenance schedule for the statistics as well.
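In PostgreSQL those maintenance jobs might boil down to statements along these lines, scheduled to run off-peak (illustrative only; the real targets and frequency depend entirely on the workload): -
-- rebuild a bloated index
REINDEX INDEX idx_actor_last_name;
-- reclaim space and refresh statistics for one table
VACUUM (ANALYZE) actor;
-- or refresh statistics across the whole database
ANALYZE;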
And again, just as with monitoring and log collection, there are a whole host of tools out there that can be used to track queries. These are incredibly useful as they give the ability to see how a query's performance has changed over a period of time.
Caution does need to be taken with some of these tools as they can have a negative effect on performance. Tracking every single query and collecting information on them can be a rather intensive operation!
So having the correct monitoring, log collection, and query tracking tools are vital when it comes to not only preventing issues from arising but allowing for quick resolution when they do occur.
And thats it for the database part of the 90DaysOfDevOps blog series. We hope this has been useful…thanks for reading!
