Merge branch 'main' of github.com:vntechies/90DaysOfDevOps
@ -23,11 +23,11 @@ date: '2022-04-17T10:12:40Z'
|
||||
|
||||
위의 블로그를 읽고 저의 학습 여정의 수준이 매우 높을 것이라고 짐작하시는 분들을 위해 말씀드리면, 저는 그 분야들에 대해 전문가는 아닙니다만 유료이건 무료이건 어떤 형태로든 자료를 공유하고 싶었습니다. 우리는 모두 서로 다른 환경에 있을 것이니 각자에게 맞는 선택을 하시길 바랍니다.
|
||||
|
||||
앞으로 90일간 저는 기본적인 영역들에 대해 다루고 자료들을 문서화하려고 합니다. 커뮤니티의 적극적인 참여도 바라고 있습니다. 많은 분의 학습 여정과 자료의 공유를 통해 공개적으로 함께 도와가며 서로 배우길 바랍니다.
|
||||
|
||||
프로젝트 리포지토리의 readme에 12주에 6일을 더한 분량의 섹션 별로 분할을 해두었습니다. 첫 6일간은 특정 영역에 뛰어들기 전에 데브옵스에 대한 전반적인 기본 지식들에 대해 학습할 것입니다. 반복해서 말씀드리면 저의 자료가 완벽하지 않습니다. 따라서 이 자료들이 유용하게 활용될 수 있도록 커뮤니티의 도움을 바랍니다.
|
||||
|
||||
지금 이 순간에 공유하고 싶은 자료가 하나 더 있습니다. 모두가 꼼꼼히 살펴야 하고, 스스로에 대한, 자신의 관심과 현재 위치를 마인드맵으로 그려야 할, 그것은
|
||||
|
||||
[DevOps Roadmap](https://roadmap.sh/devops)
|
||||
|
||||
@ -35,30 +35,31 @@ date: '2022-04-17T10:12:40Z'
|
||||
|
||||
## 첫 단계 - 데브옵스란?
|
||||
|
||||
소개드리고 싶은 유튜브 비디오나 블로그 글이 너무 많습니다. 하지만 우리는 90일의 도전을 시작했고 매일 한 시간씩 새로운 것을 배우거나 데브옵스에 관해 배우기로 했으므로, "DevOps란 무엇인가"라는 높은 수준의 정보를 먼저 얻는 것이 좋다고 생각합니다.
|
||||
|
||||
우선, 데브옵스는 도구가 아닙니다. 살 수 있는 것도 아니고, software SKU나 깃허브 레포지토리에서 다운로드 받을 수 있는 오픈 소스도 아닙니다. 또한 프로그래밍 언어도 아니고, 괴상한 흑마술도 아닙니다.
|
||||
|
||||
데브옵스란 소프트웨어 개발에서 좀 더 현명하게 일하는 방법을 말합니다. - 잠깐... 혹시 소프트웨어 개발자가 아니라면 이 학습 과정을 중단해야 할까요??? 아닙니다.. 데브옵스란 소프트웨어 개발과 운영의 통합을 의미하기 때문입니다. 위에서 언급했듯이 저는 일반적으로 운영에 속하는 VM(가상머신)과 관련된 쪽에 있었지만, 커뮤니티에는 다양한 배경을 가진 사람들이 있습니다. 그리고 개인, 개발자, 운영자 그리고 QA 엔지니어 모두는 DevOps를 더 잘 이해함으로써 모범사례에 관해 동등하게 배울 수 있습니다.
|
||||
|
||||
데브옵스는 이 목표를 달성하기 위한 일련의 관행입니다: 제품이 초기 아이디어 단계부터 최종 사용자, 내부 팀 또는 고객 등 모든 사용자에게 실제 운영 서비스로 전달되기까지의 시간을 단축하는 것.
|
||||
|
||||
첫 주에 다룰 또 다른 분야는 **애자일 방법론** 에 관한 것입니다. 애플리케이션을 지속적으로 전달(Continuous Delivery)하기 위해 데브옵스와 애자일은 주로 함께 다루어집니다.
|
||||
|
||||
개략적으로 말해 데브옵스적 사고방식이나 문화는 길고 몇 년이 걸릴 수 있는 소프트웨어 배포 프로세스를 더 작고, 자주 배포하는 방식으로 시간을 단축시키는 것입니다. 추가로 이해해야 할 또 다른 핵심 포인트는 위에 언급한 개발, 운영, QA 팀 간의 사일로를 무너트리는 것은 데브옵스 엔지니어의 책임입니다.
|
||||
|
||||
데브옵스의 관점에서 보면 **개발, 테스트, 배포**는 모두 데브옵스 팀과 함께해야 하기 때문입니다.
|
||||
|
||||
최종적으로 이런 것을 효과적이고 효율적으로 하기 위해서는 **자동화**를 최대한 활용해야 합니다.
|
||||
|
||||
## 자료
|
||||
|
||||
이곳을 학습 도구로 활용하기 위해 이 readme 파일에 추가적으로 자료를 덧붙이는 것에 대해 항상 열려있습니다.
|
||||
|
||||
그리고 아래 동영상들을 꼭 보시기 바랍니다. 또한 위에 설명드린 내용에서 많은 인사이트를 얻었으면 합니다.
|
||||
|
||||
- [DevOps in 5 Minutes](https://www.youtube.com/watch?v=Xrgk023l4lI)
|
||||
- [What is DevOps? Easy Way](https://www.youtube.com/watch?v=_Gpe1Zn-1fE&t=43s)
|
||||
- [DevOps roadmap 2022 | Success Roadmap 2022](https://www.youtube.com/watch?v=7l_n97Mt0ko)
|
||||
- [From Zero to DevOps Engineer - DevOps Roadmap for YOUR specific background](https://www.youtube.com/watch?v=G_nVMUtaqCk)
|
||||
|
||||
여기까지 읽었다면 나에게 필요한 내용인지 아닌지 알 수 있을 것입니다. [Day 2](day02.md)
|
||||
|
@ -13,7 +13,7 @@ date: '2022-04-17T21:15:34Z'
|
||||
|
||||
부디, 여러분이 자료를 찾고 [Day1 of #90DaysOfDevOps](day01.md) 페이지에 글을 올리면서 함께 참여하기를 바랍니다.
|
||||
|
||||
첫 번째 글에서 짧게 다루었습니다만, 이제 더 깊이 있는 개념 그리고 애플리케이션을 만드는 것에는 두 가지 주요 파트가 있다는 것에 대해 이해할 필요가 있습니다. 소프트웨어 개발자들이 애플리케이션을 작성하고 테스트하는 **개발** 파트와 애플리케이션을 서버에 배포하고 유지하는 **운영** 파트입니다.
|
||||
|
||||
## 데브옵스는 그 둘을 연결합니다.
|
||||
|
||||
@ -39,27 +39,27 @@ To get to grips with DevOps or the tasks which a DevOps engineer would be carryi
|
||||
|
||||
## 이것도 알고, 저것도 알고
|
||||
|
||||
네트워크 또는 인프라의 스페셜리스트가 될 필요는 없습니다. 서버를 올리고, 실행시키고 상호 간 통신이 가능하도록 구성하는 방법에 대한 지식만 있으면 됩니다. 마찬가지로 개발자가 될 필요는 없습니다. 프로그래밍 언어에 대한 기본적인 지식만 있으면 됩니다. 하지만 어느 한 분야의 전문가로서 데브옵스 업무에 참여할 수 있고, 이럴 경우 다른 분야에 적응하기 위한 매우 좋은 기반이 될 것입니다.
|
||||
|
||||
또한 서버나 애플리케이션의 관리를 매일 인계받지 못할 수도 있습니다.
|
||||
|
||||
서버에 대해서만 이야기했지만, 애플리케이션이 컨테이너로 실행되도록 개발해야 할 수도 있습니다. 여전히 서버에서 실행하는 것이라곤 하나, 가상화, IaaS (클라우드 인프라 서비스)와 더불어 컨테이너화에 대한 이해도 필요합니다.
|
||||
|
||||
## 고차원적인 개요
|
||||
|
||||
한쪽에서는 우리 개발자들이 애플리케이션을 위한 새 기능들을 (버그 수정과 더불어) 추가합니다.
|
||||
|
||||
다른 한쪽에서는 실제 애플리케이션이 실행되고 필요 서비스들과 통신하도록 구성 및 관리되고 있는 서버, 인프라 내지는 환경이 있습니다.
|
||||
|
||||
여기서 핵심은 버그 수정 및 새 기능이 추가된 버전을 운영 환경에 적용시켜 최종 사용자에게 제공하도록 하는 것입니다.
|
||||
|
||||
새 애플리케이션 버전을 어떻게 출시하는가? 이것이 데브옵스 엔지니어의 핵심 업무입니다. 테스트를 포함한 효과적이고 자동화된 방식을 지속적으로 고민해야 합니다.
|
||||
|
||||
여기서 오늘 학습을 끝내도록 하겠습니다. 부디 유용했기를 바랍니다. 앞으로 DevOps의 다양한 영역들과, 다양한 도구 및 프로세스들의 사용 이점에 대해서 깊이 있게 다루도록 하겠습니다.
|
||||
|
||||
## 자료
|
||||
|
||||
이곳을 학습 도구로 활용하기 위해 이 readme 파일에 추가적으로 자료를 덧붙이는 것에 대해 항상 열려있습니다.
|
||||
|
||||
그리고 아래 동영상들을 꼭 보시기 바랍니다. 또한 위에 설명드린 내용에서 많은 인사이트를 얻었으면 합니다.
|
||||
|
||||
|
97
2022/ko/Days/day03.md
Normal file
@ -0,0 +1,97 @@
|
||||
---
|
||||
title: '#90DaysOfDevOps - Application Focused - Day 3'
|
||||
published: false
|
||||
description: 90DaysOfDevOps - Application Focused
|
||||
tags: 'devops, 90daysofdevops, learning'
|
||||
cover_image: null
|
||||
canonical_url: null
|
||||
id: 1048825
|
||||
---
|
||||
|
||||
## 데브옵스 수명 주기 - 애플리케이션 중심
|
||||
|
||||
앞으로 몇 주 동안 계속 진행하면서 지속적 개발, 테스트, 배포, 모니터에 대해 100% 반복해서 접하게 될 것입니다. DevOps 엔지니어 역할로 향하고 있다면 반복성이 익숙해질 것이지만 매번 지속적으로 향상시키는 것도 흥미를 유지하는 또 다른 요소입니다.
|
||||
|
||||
이번 시간에는 애플리케이션을 처음부터 끝까지 살펴본 다음 다시 반복하듯 되돌아보는 고차원의 시각으로 살펴보겠습니다.
|
||||
|
||||
### Development (개발)
|
||||
|
||||
애플리케이션의 새로운 예를 들어 보겠습니다. 먼저 아무것도 만들어지지 않은 상태에서 개발자는 고객 또는 최종 사용자와 요구 사항을 논의하고 애플리케이션에 대한 일종의 계획이나 요구 사항을 마련해야 합니다. 그런 다음 요구 사항을 바탕으로 새로운 애플리케이션을 만들어야 합니다.
|
||||
|
||||
이 단계의 도구와 관련해서는 애플리케이션을 작성하는 데 사용할 IDE와 프로그래밍 언어를 선택하는 것 외에는 실제 요구 사항이 없습니다.
|
||||
|
||||
데브옵스 엔지니어로서 이 계획을 만들거나 최종 사용자를 위해 애플리케이션을 코딩하는 것은 여러분이 아니라 숙련된 개발자가 할 일이라는 점을 기억하세요.
|
||||
|
||||
그러나 애플리케이션에 대한 최상의 인프라 결정을 내릴 수 있도록 일부 코드를 읽을 수 있는 것도 나쁘지 않을 것입니다.
|
||||
|
||||
앞서 이 애플리케이션은 어떤 언어로든 작성할 수 있다고 언급했습니다. 중요한 것은 버전 관리 시스템을 사용하여 유지 관리해야 한다는 것인데, 이 부분은 나중에 자세히 다룰 것이며 특히 **Git**에 대해 자세히 살펴볼 것입니다.
|
||||
|
||||
또한 이 프로젝트에서 한 명의 개발자가 작업하는 것이 아닐 수도 있지만, 이 경우에도 모범 사례에서는 코드를 저장하고 협업하기 위한 코드 저장소가 필요하며, 이는 비공개 또는 공개일 수 있고 호스팅되거나 비공개로 배포될 수 있으며 일반적으로 **GitHub 또는 GitLab**과 같은 코드 저장소가 코드 저장소로 사용되는 것을 듣게 될 것입니다. 이에 대해서는 나중에 **Git** 섹션에서 다시 다루겠습니다.
|
||||
|
||||
### Testing (테스팅)
|
||||
|
||||
이 단계에서는 요구 사항이 있고 애플리케이션이 개발되고 있습니다. 하지만 우리가 사용할 수 있는 모든 다양한 환경, 특히 선택한 프로그래밍 언어에서 코드를 테스트하고 있는지 확인해야 합니다.
|
||||
|
||||
이 단계에서 QA는 버그를 테스트할 수 있으며, 테스트 환경을 시뮬레이션하는 데 컨테이너를 사용하는 경우가 많아져 전반적으로 물리적 또는 클라우드 인프라의 비용 오버헤드를 개선할 수 있습니다.
|
||||
|
||||
이 단계는 또한 다음 영역인 지속적 통합의 일부로 자동화될 가능성이 높습니다.
|
||||
|
||||
이 테스트를 자동화할 수 있다는 것은 수십, 수백, 심지어 수천 명의 QA 엔지니어가 이 작업을 수동으로 수행해야 하는 것과 비교하면 그 자체로 의미가 있으며, 이러한 엔지니어는 스택 내에서 다른 작업에 집중하여 워터폴 방법론을 사용하는 대부분의 기존 소프트웨어 릴리스에서 지체되는 경향이 있는 버그 및 소프트웨어 테스트 대신 더 빠르게 움직이고 더 많은 기능을 개발할 수 있습니다.
|
||||
|
||||
### Integration (통합)
|
||||
|
||||
매우 중요한 것은 통합이 데브옵스 라이프사이클의 중간에 있다는 것입니다. 개발자가 소스 코드에 변경 사항을 더 자주 커밋해야 하는 practice(관행)입니다. 이는 매일 또는 매주 단위로 이루어질 수 있습니다.
|
||||
|
||||
커밋할 때마다 애플리케이션은 자동화된 테스트 단계를 거치게 되며, 이를 통해 다음 단계로 넘어가기 전에 문제나 버그를 조기에 발견할 수 있습니다.
|
||||
|
||||
이제 이 단계에서 "하지만 우리는 애플리케이션을 만들지 않고 소프트웨어 공급업체에서 기성품을 구입합니다."라고 말할 수 있습니다. 많은 회사가 이렇게 하고 있고 앞으로도 계속 그렇게 할 것이며 위의 3단계에 집중하는 것은 소프트웨어 공급업체가 될 것이므로 걱정하지 마세요. 하지만 마지막 단계를 채택하면 기성품 배포를 더 빠르고 효율적으로 배포할 수 있으므로 여전히 채택하고 싶을 수도 있습니다.
|
||||
|
||||
오늘 당장 상용 소프트웨어를 구매할 수도 있지만 내일이나 또는... 다음 직장에서 사용할 수도 있기 때문에 위의 지식을 갖추는 것만으로도 매우 중요하다고 말씀드리고 싶습니다.
|
||||
|
||||
### Deployment (배포)
|
||||
|
||||
이제 최종 사용자의 요구 사항에 따라 애플리케이션을 빌드하고 테스트를 마쳤으므로 이제 최종 사용자가 사용할 수 있도록 이 애플리케이션을 프로덕션에 배포해야 합니다.
|
||||
|
||||
이 단계는 코드를 프로덕션 서버에 배포하는 단계로, 이제부터 매우 흥미로운 일이 벌어지며 나머지 86일 동안 이러한 영역에 대해 더 자세히 알아볼 것입니다. 애플리케이션마다 필요한 하드웨어나 구성이 다르기 때문입니다. 바로 이 부분에서 **Application Configuration Management(애플리케이션 구성 관리)**와 **Infrastructure as Code(코드형 인프라)**가 데브옵스 라이프사이클에서 중요한 역할을 할 수 있습니다. 애플리케이션이 **Containerised(컨테이너화)**되어 있지만 가상 머신에서 실행할 수 있는 경우도 있을 수 있습니다. 그런 다음 이러한 컨테이너를 오케스트레이션하고 최종 사용자가 원하는 상태를 사용할 수 있도록 하는 **Kubernetes**와 같은 플랫폼으로 이어집니다.
|
||||
|
||||
위에 굵게 표시한 주제들은 앞으로 몇 주에 걸쳐 더 자세히 살펴보면서, 각각이 무엇이고 언제 사용하는지에 대한 기초 지식을 쌓을 것입니다.
|
||||
|
||||
### Monitoring (관제)
|
||||
|
||||
새로운 기능으로 지속적으로 업데이트하고 있는 애플리케이션이 있으며, 테스트 과정에서 문제점이 발견되지 않는지 확인하고 있습니다. 필요한 구성과 성능을 지속적으로 유지할 수 있는 애플리케이션이 우리 환경에서 실행되고 있습니다.
|
||||
|
||||
하지만 이제 최종 사용자가 필요한 경험을 얻고 있는지 확인해야 합니다. 이 단계에서는 애플리케이션 성능을 지속적으로 모니터링하여 개발자가 향후 릴리스에서 애플리케이션을 개선하여 최종 사용자에게 더 나은 서비스를 제공할 수 있도록 더 나은 결정을 내릴 수 있도록 해야 합니다.
|
||||
|
||||
또한 이 섹션에서는 구현된 기능에 대한 피드백을 수집하고 최종 사용자가 어떻게 개선하기를 원하는지에 대한 피드백을 수집할 것입니다.
|
||||
|
||||
안정성은 여기서도 핵심 요소이며, 결국에는 애플리케이션이 필요할 때 항상 사용할 수 있기를 원합니다. 이는 지속적으로 모니터링해야 하는 다른 **observability, security and data management(관찰 가능성, 보안 및 데이터 관리)** 영역으로 이어지며, 피드백을 통해 애플리케이션을 지속적으로 개선, 업데이트 및 릴리스하는 데 항상 사용할 수 있습니다.
|
||||
|
||||
특히 [@\_ediri](https://twitter.com/_ediri) 커뮤니티의 일부 의견은 이러한 지속적인 프로세스의 일부로 FinOps 팀도 참여해야 한다고 언급했습니다. 앱과 데이터는 어딘가에서 실행되고 저장되므로 리소스 관점에서 상황이 변경될 경우 비용이 클라우드 요금에 큰 재정적 고통을 주지 않도록 지속적으로 모니터링해야 합니다.
|
||||
|
||||
위에서 언급한 "DevOps 엔지니어"에 대해서도 언급할 때가 되었다고 생각합니다. 많은 사람들이 DevOps 엔지니어라는 직책을 가지고 있지만, 이것은 DevOps 프로세스를 포지셔닝하는 이상적인 방법은 아닙니다. 제 말은 커뮤니티의 다른 사람들과 이야기할 때 데브옵스 엔지니어라는 직책이 누구의 목표가 되어서는 안 된다는 것입니다. 실제로 어떤 직책이든 여기에서 설명한 데브옵스 프로세스와 문화를 채택해야 하기 때문입니다. 데브옵스는 클라우드 네이티브 엔지니어/아키텍트, 가상화 관리자, 클라우드 아키텍트/엔지니어, 인프라 관리자와 같은 다양한 직책에서 사용되어야 합니다. 몇 가지 예를 들었지만, 위에서 DevOps 엔지니어를 사용한 이유는 위의 모든 직책에서 사용하는 프로세스의 범위 등을 강조하기 위해서입니다.
|
||||
|
||||
## Resources
|
||||
|
||||
학습 도구로서 이 Readme 파일에 추가 리소스를 추가하는 것은 언제나 열려 있습니다.
|
||||
|
||||
아래의 내용을 모두 보시고 위의 텍스트와 설명에서 많은 것들을 얻으셨으면 좋겠습니다.
|
||||
|
||||
- [Continuous Development](https://www.youtube.com/watch?v=UnjwVYAN7Ns) I will also add that this is focused on manufacturing but the lean culture can be closely followed with DevOps.
|
||||
- [Continuous Testing - IBM YouTube](https://www.youtube.com/watch?v=RYQbmjLgubM)
|
||||
- [Continuous Integration - IBM YouTube](https://www.youtube.com/watch?v=1er2cjUq1UI)
|
||||
- [Continuous Monitoring](https://www.youtube.com/watch?v=Zu53QQuYqJ0)
|
||||
- [The Remote Flow](https://www.notion.so/The-Remote-Flow-d90982e77a144f4f990c135f115f41c6)
|
||||
- [FinOps Foundation - What is FinOps](https://www.finops.org/introduction/what-is-finops/)
|
||||
- [**NOT FREE** The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win](https://www.amazon.com/Phoenix-Project-DevOps-Helping-Business/dp/1942788290/)
|
||||
|
||||
여기까지 왔다면 이곳이 자신이 원하는 곳인지 아닌지 알 수 있을 것입니다. 다음에 뵙겠습니다. [Day 4](day04.md).
|
99
2022/ko/Days/day04.md
Normal file
@ -0,0 +1,99 @@
|
||||
---
|
||||
title: '#90DaysOfDevOps - DevOps & Agile - Day 4'
|
||||
published: false
|
||||
description: 90DaysOfDevOps - DevOps & Agile
|
||||
tags: 'devops, 90daysofdevops, learning'
|
||||
cover_image: null
|
||||
canonical_url: null
|
||||
id: 1048700
|
||||
---
|
||||
|
||||
## DevOps & Agile (데브옵스 & 애자일)
|
||||
|
||||
데브옵스와 애자일의 차이점을 알고 계신가요? 데브옵스와 애자일은 독립적인 개념으로 형성되었습니다. 하지만 이제 이 두 용어는 융합되고 있습니다.
|
||||
|
||||
이 글에서는 애자일과 데브옵스의 중요한 차이점을 살펴보고 이 둘이 긴밀하게 연결되어 있는 이유를 알아보겠습니다.
|
||||
|
||||
이 분야를 배우면서 제가 본 공통적인 관점, 즉 목표와 프로세스가 비슷하지만 데브옵스와 애자일에 대해 조금 더 이해하는 것이 시작하기에 좋은 출발점이라고 생각합니다. 이 섹션에서는 이에 대해 간략하게 정리해 보려고 합니다.
|
||||
|
||||
정의부터 시작하겠습니다.
|
||||
|
||||
### Agile Development
|
||||
|
||||
애자일은 제품의 큰 결과물을 한 번에 출시하기보다는 작은 결과물을 더 빠르게 제공하는 데 중점을 두는 접근 방식으로, 소프트웨어는 반복적으로 개발됩니다. 팀은 매주 또는 매달 점진적인 업데이트를 통해 새 버전을 출시합니다. 애자일의 최종 목표는 최종 사용자에게 최적의 경험을 제공하는 것입니다.
|
||||
|
||||
### DevOps
|
||||
|
||||
지난 며칠 동안 데브옵스의 최종 목표를 설명하는 몇 가지 다른 방법으로 이 문제를 다뤄왔습니다. 데브옵스는 일반적으로 소프트웨어 개발자와 운영 전문가 간의 협력을 기반으로 하는 배포 관행을 설명합니다. 데브옵스의 주요 이점은 간소화된 개발 프로세스를 제공하고 잘못된 커뮤니케이션을 최소화하는 것입니다.
|
||||
|
||||
## 애자일과 데브옵스의 차이점은 무엇인가?
|
||||
|
||||
차이점은 주로 선입견에 있습니다. 애자일과 데브옵스는 서로 다른 선입견을 가지고 있지만 서로를 돕고 있습니다. 애자일은 짧은 반복을 원하는데, 이는 데브옵스가 제공하는 자동화를 통해서만 가능합니다. 애자일은 고객이 특정 버전을 사용해보고 신속하게 피드백을 주기를 원하는데, 이는 데브옵스가 새로운 환경을 쉽게 만들 수 있을 때만 가능합니다.
|
||||
|
||||
### Different participants (서로 다른 참여자)
|
||||
|
||||
애자일은 최종 사용자와 개발자 간의 커뮤니케이션을 최적화하는 데 중점을 두는 반면 데브옵스는 개발자와 운영, 팀원을 대상으로 합니다. 애자일은 고객을 향한 외부 지향적인 반면 데브옵스는 일련의 내부 관행이라고 할 수 있습니다.
|
||||
|
||||
### 팀
|
||||
|
||||
애자일은 일반적으로 소프트웨어 개발자와 프로젝트 관리자에게 적용됩니다. 데브옵스 엔지니어의 역량은 제품 주기의 모든 단계에 관여하고 애자일 팀의 일원이므로 개발, QA(품질 보증) 및 운영이 교차하는 지점에 있습니다.
|
||||
|
||||
### 적용된 프레임워크
|
||||
|
||||
애자일에는 유연성과 투명성을 달성하기 위한 다양한 관리 프레임워크가 있습니다: Scrum > Kanban > Lean > Extreme > Crystal > Dynamic > Feature-Driven. 데브옵스는 협업을 통한 개발 접근 방식에 중점을 두지만 구체적인 방법론을 제공하지는 않습니다. 그러나 데브옵스는 코드형 인프라, 코드형 아키텍처, 모니터링, 자가 치유, 엔드투엔드 테스트 자동화와 같은 관행을 장려합니다. 그러나 이것은 그 자체로 프레임워크가 아니라 관행입니다.
|
||||
|
||||
### 피드백
|
||||
|
||||
애자일에서는 피드백의 주요 출처가 최종 사용자인 반면, 데브옵스에서는 이해관계자와 팀 자체의 피드백이 더 높은 우선순위를 갖습니다.
|
||||
|
||||
### 대상 영역
|
||||
|
||||
애자일은 배포 및 유지 관리보다 소프트웨어 개발에 더 중점을 둡니다. 데브옵스는 소프트웨어 개발에도 중점을 두지만 그 가치와 도구는 모니터링, 고가용성, 보안 및 데이터 보호와 같은 배포 및 릴리스 후 단계에도 적용됩니다.
|
||||
|
||||
### 문서
|
||||
|
||||
애자일은 문서화 및 모니터링보다 유연성과 당면한 작업에 우선순위를 둡니다. 반면 데브옵스는 프로젝트 문서를 필수 프로젝트 구성 요소 중 하나로 간주합니다.
|
||||
|
||||
### 위험요소
|
||||
|
||||
애자일 리스크는 방법론의 유연성에서 비롯됩니다. 애자일 프로젝트는 우선순위와 요구사항이 계속 변하기 때문에 예측하거나 평가하기가 어렵습니다.
|
||||
|
||||
데브옵스 위험은 용어에 대한 오해와 적절한 도구의 부재에서 비롯됩니다. 어떤 사람들은 데브옵스를 개발 프로세스의 기본 구조를 바꾸지 못하는 배포 및 지속적 통합을 위한 소프트웨어 모음으로 간주합니다.
|
||||
|
||||
### 사용되는 툴들
|
||||
|
||||
애자일 도구는 경영진의 커뮤니케이션 협업, 메트릭 및 피드백 처리에 중점을 둡니다. 가장 인기 있는 애자일 도구로는 JIRA, Trello, Slack, Zoom, SurveyMonkey 등이 있습니다.
|
||||
|
||||
데브옵스는 팀 커뮤니케이션, 소프트웨어 개발, 배포 및 통합을 위해 Jenkins, GitHub Actions, BitBucket 등과 같은 도구를 사용합니다. 애자일과 데브옵스는 초점과 범위가 약간 다르지만 핵심 가치는 거의 동일하므로 두 가지를 결합할 수 있습니다.
|
||||
|
||||
## 모두 모아본다면... 좋은 선택일까요? 논의가 필요할까요?
|
||||
|
||||
애자일과 데브옵스를 결합하면 다음과 같은 이점을 얻을 수 있습니다:
|
||||
|
||||
- 유연한 관리와 강력한 기술.
|
||||
- 애자일 관행은 데브옵스 팀이 우선순위를 보다 효율적으로 소통하는 데 도움이 됩니다.
|
||||
- 데브옵스 관행을 위해 지불해야 하는 자동화 비용은 신속하고 자주 배포해야 하는 애자일 요구 사항에 따라 정당화됩니다.
|
||||
- 애자일 방식을 채택하는 팀은 협업을 개선하고 팀의 동기를 높이며 직원 이직률을 낮출 수 있습니다.
|
||||
- 결과적으로 제품 품질이 향상됩니다.
|
||||
|
||||
애자일을 사용하면 이전 제품 개발 단계로 돌아가 오류를 수정하고 기술 부채의 누적을 방지할 수 있습니다. 애자일과 데브옵스를 동시에 도입하려면 다음 7단계를 따르세요:
|
||||
|
||||
1. 개발 팀과 운영 팀을 통합합니다.
|
||||
2. 빌드 및 운영 팀을 만들고 모든 개발 및 운영 관련 문제를 전체 DevOps 팀에서 논의합니다.
|
||||
3. 스프린트에 대한 접근 방식을 변경하고 우선순위를 지정하여 개발 작업과 동일한 가치를 지닌 DevOps 작업을 제공하세요. 개발 팀과 운영 팀이 다른 팀의 워크플로와 발생 가능한 문제에 대해 의견을 교환하도록 장려하세요.
|
||||
4. 모든 개발 단계에 QA를 포함하세요.
|
||||
5. 올바른 도구를 선택하세요.
|
||||
6. 가능한 모든 것을 자동화하세요.
|
||||
7. 가시적인 수치 결과물을 사용하여 측정하고 제어하세요.
|
||||
|
||||
어떻게 생각하시나요? 다른 견해가 있으신가요? 개발자, 운영, QA 또는 애자일 및 DevOps에 대해 더 잘 이해하고 이에 대한 의견과 피드백을 전달해 주실 수 있는 분들의 의견을 듣고 싶습니다.
|
||||
|
||||
### Resources
|
||||
|
||||
- [DevOps for Developers – Day in the Life: DevOps Engineer in 2021](https://www.youtube.com/watch?v=2JymM0YoqGA)
|
||||
- [3 Things I wish I knew as a DevOps Engineer](https://www.youtube.com/watch?v=udRNM7YRdY4)
|
||||
- [How to become a DevOps Engineer feat. Shawn Powers](https://www.youtube.com/watch?v=kDQMjAQNvY4)
|
||||
|
||||
여기까지 왔다면 이곳이 자신이 원하는 곳인지 아닌지 알 수 있을 것입니다. 다음에 뵙겠습니다. [Day 5](day05.md).
|
50
2023.md
@ -82,7 +82,7 @@ Or contact us via Twitter, my handle is [@MichaelCade1](https://twitter.com/Mich
|
||||
- [✔️] 🏃 31 > [Runtime network protections and policies](2023/day31.md)
|
||||
- [✔️] 🏃 32 > [Vulnerability and patch management](2023/day32.md)
|
||||
- [✔️] 🏃 33 > [Application runtime and network policies](2023/day33.md)
|
||||
- [✔️] 🏃 34 > [Runtime access control](2023/day34.md)
|
||||
|
||||
### Secrets Management
|
||||
|
||||
@ -96,28 +96,28 @@ Or contact us via Twitter, my handle is [@MichaelCade1](https://twitter.com/Mich
|
||||
|
||||
### Python
|
||||
|
||||
- [✔️] 🐍 42 > [Programming Language: Introduction to Python](2023/day42.md)
|
||||
- [✔️] 🐍 43 > [Python Loops, functions, modules and libraries](2023/day43.md)
|
||||
- [✔️] 🐍 44 > [Data Structures and OOP in Python](2023/day44.md)
|
||||
- [✔️] 🐍 45 > [Debugging, testing and Regular expression](2023/day45.md)
|
||||
- [✔️] 🐍 46 > [Web development in Python](2023/day46.md)
|
||||
- [✔️] 🐍 47 > [Automation with Python](2023/day47.md)
|
||||
- [✔️] 🐍 48 > [Let's build an App in Python](2023/day48.md)
|
||||
|
||||
### AWS
|
||||
|
||||
- [✔️] ☁️ 49 > [AWS Cloud Overview](2023/day49.md)
|
||||
- [✔️] ☁️ 50 > [Create Free Tier Account & Enable Billing Alarms](2023/day50.md)
|
||||
- [✔️] ☁️ 51 > [Infrastructure as Code (IaC) and CloudFormation](2023/day51.md)
|
||||
- [✔️] ☁️ 52 > [Identity and Access Management (IAM)](2023/day52.md)
|
||||
- [✔️] ☁️ 53 > [AWS Systems Manager](2023/day53.md)
|
||||
- [✔️] ☁️ 54 > [AWS CodeCommit](2023/day54.md)
|
||||
- [✔️] ☁️ 55 > [AWS CodePipeline](2023/day55.md)
|
||||
|
||||
### Red Hat OpenShift
|
||||
|
||||
- [✔️] ⛑️ 56 > [What does Red Hat OpenShift bring to the party? An Overview](2023/day56.md)
|
||||
- [] ⛑️ 57 > [Understanding the OpenShift Architecture + Spinning up an instance](2023/day57.md)
|
||||
- [] ⛑️ 58 > [](2023/day58.md)
|
||||
- [] ⛑️ 59 > [](2023/day59.md)
|
||||
- [] ⛑️ 60 > [](2023/day60.md)
|
||||
@ -126,18 +126,18 @@ Or contact us via Twitter, my handle is [@MichaelCade1](https://twitter.com/Mich
|
||||
|
||||
### Databases
|
||||
|
||||
- [] 🛢 63 > [An introduction to databases](2023/day63.md)
|
||||
- [] 🛢 64 > [Querying data in databases](2023/day64.md)
|
||||
- [] 🛢 65 > [Backing up and restoring databases](2023/day65.md)
|
||||
- [] 🛢 66 > [High availability and disaster recovery](2023/day66.md)
|
||||
- [] 🛢 67 > [Performance tuning](2023/day67.md)
|
||||
- [] 🛢 68 > [Database security](2023/day68.md)
|
||||
- [] 🛢 69 > [Monitoring and troubleshooting database issues](2023/day69.md)
|
||||
|
||||
### Serverless
|
||||
|
||||
- [✔️] 👩🏿💻 70 > [What is Serverless?](2023/day70.md)
|
||||
- [✔️] 👩🏿💻 71 > [Serverless Compute](2023/day71.md)
|
||||
- [] 👩🏿💻 72 > [](2023/day72.md)
|
||||
- [] 👩🏿💻 73 > [](2023/day73.md)
|
||||
- [] 👩🏿💻 74 > [](2023/day74.md)
|
||||
|
219
2023/day34.md
@ -0,0 +1,219 @@
|
||||
# Runtime access control
|
||||
|
||||
Runtime access control is crucial in a computer system because it helps ensure the security and integrity of a computer system cluster and the applications running on it. A computer system is a complex system with many moving parts, and it is essential to control access to these components to prevent unauthorized access or malicious activities.
|
||||
|
||||
Here are some reasons why runtime access control is important in a computer system:
|
||||
|
||||
Protects the Cluster from Unauthorized Access: Access control ensures that only authorized users or processes can interact with the computer system API server or cluster components. Unauthorized access could result in data breaches, theft of sensitive information, or compromise of the entire cluster.
|
||||
|
||||
Prevents Misuse of Resources: computer system manages and allocates resources such as CPU, memory, and network bandwidth. Access control helps ensure that these resources are used appropriately and that applications are not using more resources than they need.
|
||||
|
||||
Ensures Compliance: Access control helps ensure that the computer system and the applications running on it comply with organizational policies, industry standards, and regulatory requirements such as HIPAA, GDPR, or PCI-DSS.
|
||||
|
||||
Facilitates Auditing and Accountability: Access control provides an audit trail of who accessed what resources and when. This information is useful for tracking down security incidents, troubleshooting, and compliance reporting.
|
||||
|
||||
For example, Kubernetes provides several mechanisms for access control, including authentication mechanisms, role-based access control (RBAC), admission control, network policies, and more. It is important to properly configure and manage access control to ensure the security and reliability of a computer system cluster.
|
||||
|
||||
## Authentication
|
||||
|
||||
Authentication is the process of verifying the identity of a user or process attempting to access the Kubernetes API server or cluster resources. Kubernetes provides several authentication mechanisms, including X.509 client certificates, bearer tokens, and OpenID Connect (OIDC) tokens.
|
||||
|
||||
X.509 client certificates are the most secure and widely used authentication mechanism in Kubernetes. In this method, a client presents a valid X.509 client certificate to the API server, which verifies the certificate against a trusted Certificate Authority (CA).
|
||||
|
||||
Bearer tokens are another popular authentication mechanism in Kubernetes. A bearer token is a string of characters that represents the identity of a user or process. The API server validates the token against a configured TokenReview API server.
|
||||
|
||||
OIDC tokens are a newer authentication mechanism in Kubernetes. OIDC is an identity layer on top of the OAuth 2.0 protocol that enables authentication and authorization using third-party identity providers such as Google, Azure, or Okta.
|
||||
|
||||
Kubernetes also supports Webhook token authentication, in which the API server sends an authentication request to a configured webhook service. The webhook service validates the request and returns a response indicating whether the authentication succeeded or failed.
|
||||
|
||||
In addition to authentication, Kubernetes provides authorization mechanisms that control access to specific resources. Role-Based Access Control (RBAC) is the most widely used authorization mechanism in Kubernetes. RBAC allows administrators to define roles and permissions for users or groups of users based on their job functions or responsibilities.
|
||||
|
||||
Kubernetes also provides other authorization mechanisms such as Attribute-Based Access Control (ABAC) and Node Authorization.
|
||||
|
||||
Authentication and authorization are essential components of securing a Kubernetes cluster. They help ensure that only authorized users and processes can access cluster resources and protect against unauthorized access, data breaches, and other security threats.
|
||||
|
||||
Kubernetes administrators should carefully configure and manage authentication and authorization to ensure the security and reliability of their clusters. Best practices include using secure authentication mechanisms such as X.509 certificates, restricting access to the Kubernetes API server, and enabling RBAC to control access to resources.
|
||||
|
||||
Kubernetes authentication is a complex topic that requires a deep understanding of the underlying security mechanisms and protocols. Kubernetes administrators and security professionals should stay up-to-date with the latest authentication and authorization best practices and security updates to keep their clusters secure and compliant.
|
||||
|
||||
There is no question that authentication tokens and credentials are cornerstones of the security of a Kubernetes cluster. The same is true of access control in any computer system.
|
||||
|
||||
Here is an example of how credentials can be used in a way that was never intended by the design.
|
||||
|
||||
I assume that your Minikube is still up and running. You can obtain the Kubernetes Service Account token of the Kube-proxy component with the following command:
|
||||
```bash
|
||||
kubectl -n kube-system exec $(kubectl get pods -n kube-system | grep kube-proxy | head -n 1 | awk '{print $1}') -- cat /var/run/secrets/kubernetes.io/serviceaccount/token
|
||||
```
|
||||
|
||||
Note: if you want to learn more about the content of this JWT go to [jwt.io](https://jwt.io/) and parse the token you got with the previous command!
|
||||
|
||||

|
||||
|
||||
We will see here how easy it is to masquerade as someone else once the above token is obtained. We will set up `kubectl` to use it instead of the default credentials.
|
||||
|
||||
```bash
|
||||
export KUBE_PROXY_POD_NAME=`kubectl get pods -n kube-system | grep kube-proxy | head -n 1 | awk '{print $1}'`
|
||||
export TOKEN=`kubectl -n kube-system exec $KUBE_PROXY_POD_NAME -- cat /var/run/secrets/kubernetes.io/serviceaccount/token`
|
||||
export API_SERVER_URL=`kubectl config view --minify --output jsonpath="{.clusters[*].cluster.server}"`
|
||||
kubectl -n kube-system exec $KUBE_PROXY_POD_NAME -- cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt > /tmp/ca.crt
|
||||
kubectl config set-cluster access-test --server=$API_SERVER_URL --certificate-authority=/tmp/ca.crt
|
||||
kubectl config set-context access-test --cluster=access-test
|
||||
kubectl config set-credentials user --token=$TOKEN
|
||||
kubectl config set-context access-test --user=user
|
||||
kubectl config use-context access-test
|
||||
```
|
||||
|
||||
Now that we have set up our `kubectl` to use the above token we "stole" from the Kube-proxy, we can see it working in action:
|
||||
```bash
|
||||
kubectl get nodes
|
||||
```
|
||||
|
||||
Voila! 😄
|
||||
|
||||
This was a simple example of how credentials can be used by malicious actors in case they're stolen.
|
||||
|
||||
(if you used Minikube, revert to your original context by `kubectl config use-context minikube`)
|
||||
|
||||
## Authorization
|
||||
|
||||
Let's continue the above journey with what is after authentication.
|
||||
|
||||
Kubernetes Role-Based Access Control (RBAC) is a security mechanism used to control access to resources within a Kubernetes cluster. RBAC is used to define policies that determine what actions users and service accounts are allowed to perform on Kubernetes resources.
|
||||
|
||||
In Kubernetes, RBAC works by defining two main types of objects: roles and role bindings. A role is a collection of permissions that can be applied to one or more resources in a Kubernetes cluster. Role binding is used to grant a role to a user, group of users or service accounts.
|
||||
|
||||
When a user or service account attempts to perform an action on a resource in Kubernetes, the Kubernetes API server checks the permissions defined in the relevant role binding. If the user or service account is authorized to perform the action, the API server grants access. If the user or service account is not authorized, the API server denies access.
|
||||
|
||||
RBAC can be used to control access to a wide range of Kubernetes resources, including pods, services, deployments, and more. RBAC policies can be defined at various levels of the Kubernetes cluster, including the cluster level, namespace level, and individual resource level.
|
||||
|
||||
RBAC can be configured using the Kubernetes API or using tools such as `kubectl`. With RBAC, administrators can enforce strict security policies and help to ensure that only authorized users and service accounts are able to access and modify Kubernetes resources, reducing the risk of unauthorized access and data breaches.
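As a minimal sketch of what that looks like in practice (the role, binding, and service account names below are made up for illustration), a namespaced `Role` and `RoleBinding` granting read-only access to Pods could be created like this:

```bash
kubectl apply -f - << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader            # hypothetical role name
rules:
- apiGroups: [""]             # "" is the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: default
  name: read-pods             # hypothetical binding name
subjects:
- kind: ServiceAccount
  name: demo-sa               # hypothetical service account
  namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF
```

You could then check what that service account may do with `kubectl auth can-i list pods -n default --as=system:serviceaccount:default:demo-sa`.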
|
||||
|
||||
In the case above with Kube-proxy, this workload has a service account. How do we know it? Run the following command:
|
||||
```bash
|
||||
kubectl -n kube-system get daemonset kube-proxy -o=jsonpath='{.spec.template.spec.serviceAccount}'
|
||||
```
|
||||
It returns `kube-proxy` as the associated service account.
|
||||
|
||||
If you list all the `ClusterRoleBindings`, you will see that this service account is bound with `kubeadm:node-proxier` and `system:node-proxier` `ClusterRoles`.
|
||||
```bash
|
||||
kubectl get clusterrolebindings -o wide | grep kube-proxy
|
||||
```
|
||||
|
||||
You can see what these `ClusterRoles` allow this service account to do by querying them with `kubectl`:
|
||||
```bash
|
||||
kubectl get clusterrole system:node-proxier -o yaml
|
||||
```
|
||||
|
||||
You will see that this role enables:
|
||||
* List and watch on `endpoint` and `service` objects
|
||||
* Get, list and watch on `nodes`
|
||||
* Create, patch, update on `events`
|
||||
|
||||
This is why `kubectl get nodes` worked in the previous section.
|
||||
|
||||
Another example is the ClusterRole called `system:controller:deployment-controller`. It is the role associated with the service account of the Deployment controller, the component in charge of managing `ReplicaSets` for `Deployments` and making sure the downstream object (the `ReplicaSet`) always stays consolidated with the `Deployment` definition.
|
||||
|
||||
```bash
|
||||
kubectl get clusterrole system:controller:deployment-controller -o yaml
|
||||
```
|
||||
|
||||
Here you can see that this role authorizes the subject to create, delete, update, and so on, on `ReplicaSets`, which makes sense given the functionality this component has.
|
||||
|
||||
Is Kubernetes RBAC a good authorization system? Yes, but...
|
||||
* It can be a bit complex to manage sometimes
|
||||
* Authorization can be given to combinations of verb and object (what can you do with what)
|
||||
|
||||
The latter is not an obvious limitation. You can allow someone to create Pods but you cannot limit the same subject to creating only un-privileged Pods since both are the same objects.
|
||||
|
||||
This brings us to the last part of today's content.
|
||||
|
||||
## Runtime admission controllers
|
||||
|
||||
In Kubernetes, an admission controller is a type of plug-in that intercepts requests to the Kubernetes API server before they are processed, allowing administrators to enforce custom policies and restrictions on the resources being created or modified.
|
||||
|
||||
Admission controllers are used to validate and modify resource specifications before they are persisted to the Kubernetes API server. They can be used to enforce a wide range of policies, such as ensuring that all pods have a specific label, preventing the creation of privileged containers, or restricting access to certain namespaces.
|
||||
|
||||
Admission controllers in Kubernetes come in two types:
|
||||
* MutatingAdmissionWebhook: This controller can modify or mutate requests to the Kubernetes API server before they are persisted.
|
||||
* ValidatingAdmissionWebhook: This controller can validate or reject requests to the Kubernetes API server based on custom policies.
|
||||
|
||||
Admission controllers can be customized or extended to meet the specific needs of an organization or application. By using admission controllers, administrators can ensure that resources in the Kubernetes cluster conform to specific policies and security requirements, helping to reduce the risk of security breaches and ensuring a consistent and secure deployment environment.
|
||||
|
||||
There are two great examples of open-source admission controller projects: [OPA Gatekeeper](https://open-policy-agent.github.io/gatekeeper/website/docs/) and [Kyverno](https://kyverno.io/). We will use Kyverno today.
|
||||
|
||||
|
||||
Kyverno allows users to define policies as code and apply them to Kubernetes resources such as pods, deployments, services, and more. Policies can be written in YAML or JSON and can be customized to enforce specific requirements for an organization or application. Kyverno policies can be applied to resources at the time of creation or updated later as needed.
|
||||
|
||||
Kyverno is a powerful tool that can help to ensure that Kubernetes resources are configured and managed according to organizational policies and best practices. It can help to improve the security, compliance, and consistency of Kubernetes deployments while also simplifying policy management for administrators and developers.
|
||||
|
||||
To install Kyverno on our Minikube, use the following commands:
|
||||
```bash
|
||||
helm repo add kyverno https://kyverno.github.io/kyverno/
|
||||
helm repo update
|
||||
helm install kyverno kyverno/kyverno -n kyverno --create-namespace --set replicaCount=1
|
||||
helm install kyverno-policies kyverno/kyverno-policies -n kyverno
|
||||
```
|
||||
|
||||
Let's create a policy that prevents privileged Pods.
|
||||
```bash
|
||||
kubectl apply -f - << EOF
|
||||
apiVersion: kyverno.io/v1
|
||||
kind: ClusterPolicy
|
||||
metadata:
|
||||
name: no-privileged-containers
|
||||
annotations:
|
||||
policies.kyverno.io/title: No Privileged Containers
|
||||
policies.kyverno.io/subject: Pod
|
||||
spec:
|
||||
validationFailureAction: Enforce
|
||||
rules:
|
||||
- name: no-privileged-containers
|
||||
match:
|
||||
any:
|
||||
- resources:
|
||||
kinds:
|
||||
- Pod
|
||||
validate:
|
||||
message: >-
|
||||
Privileged containers are not allowed!
|
||||
pattern:
|
||||
spec:
|
||||
containers:
|
||||
- =(securityContext):
|
||||
=(privileged): "false"
|
||||
EOF
|
||||
```
|
||||
|
||||
You can see that this policy validates that the `privileged` flag is false under `securityContext` field in Pods.
|
||||
|
||||
Now if I try to spawn up a privileged Pod, it will fail. Try it:
|
||||
|
||||
```bash
|
||||
kubectl apply -f - << EOF
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: privileged-container-demo
|
||||
spec:
|
||||
containers:
|
||||
- name: privileged-container-demo
|
||||
image: nginx:latest
|
||||
securityContext:
|
||||
privileged: true
|
||||
EOF
|
||||
```
|
||||
|
||||
This should fail (without the Kyverno policy it would succeed) with an error like:
|
||||
|
||||
```
|
||||
admission webhook "validate.kyverno.svc-fail" denied the request:
|
||||
|
||||
policy Pod/default/privileged-container-demo for resource violation:
|
||||
|
||||
no-privileged-containers:
|
||||
no-privileged-containers: 'validation error: Privileged containers are not allowed!.
|
||||
rule no-privileged-containers failed at path /spec/containers/0/securityContext/privileged/'
|
||||
```
|
||||
|
||||
I hope this short intro gave a little taste of how admission controllers can help you enforce runtime rules over a Kubernetes cluster!
|
||||
|
155
2023/day48.md
@ -0,0 +1,155 @@
|
||||
# Day 48 - Let's build an App in Python
|
||||
|
||||
Let's create a simple blog app with the help of [Flask](https://flask.palletsprojects.com/en/2.2.x/) that supports posts in [markdown.](https://www.markdownguide.org/basic-syntax/)
|
||||
|
||||
## Initiating virtual env and installing packages
|
||||
|
||||
Let's create a directory for our blog project. After you have created your project directory, create a virtual environment using the following commands:
|
||||
- Windows
|
||||
``` bash
|
||||
c:\>python -m venv c:\path\to\myenv
|
||||
```
|
||||
- Linux/macOS
|
||||
``` bash
|
||||
python3 -m venv /path/to/new/virtual/environment
|
||||
```
|
||||
|
||||
Activate the virtual environment:
|
||||
- Windows cmd
|
||||
``` bash
|
||||
C:\> <venv>\Scripts\activate.bat
|
||||
```
|
||||
|
||||
- Windows powershell
|
||||
``` powershell
|
||||
<venv>\Scripts\Activate.ps1
|
||||
```
|
||||
|
||||
- Linux/macOS
|
||||
``` bash
|
||||
source <venv>/bin/activate
|
||||
```
|
||||
|
||||
Now let's use `pip` to install required modules and packages that we will be using in this project.
|
||||
``` bash
|
||||
pip install flask markdown
|
||||
```
|
||||
|
||||
## Creating the flask app
|
||||
|
||||
First, create a new Flask app by creating a file in the root of the project directory called `main.py`:
|
||||
|
||||
``` python
|
||||
from flask import Flask, render_template
|
||||
import markdown
import os  # used later to list the Markdown files in the posts directory
|
||||
|
||||
app = Flask(__name__)
|
||||
```
|
||||
|
||||
Define a route for the home page:
|
||||
``` python
|
||||
@app.route('/')
|
||||
def home():
|
||||
return render_template('index.html')
|
||||
```
|
||||
|
||||
Define a route to handle requests for individual blog posts:
|
||||
|
||||
``` python
|
||||
@app.route('/posts/<path:path>')
|
||||
def post(path):
|
||||
with open(f'posts/{path}.md', 'r') as file:
|
||||
content = file.read()
|
||||
html = markdown.markdown(content)
|
||||
return render_template('post.html', content=html)
|
||||
```
|
||||
|
||||
Create templates for the home page and individual blog posts. We can do this by creating a new directory in the root of the project called `templates`, and then creating the following two `html` files:
|
||||
|
||||
- `index.html`:
|
||||
|
||||
``` html
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>My Blog</title>
|
||||
</head>
|
||||
<body>
|
||||
<h1>My Blog</h1>
|
||||
{% for post in posts %}
|
||||
<h2><a href="/posts/{{ post }}">{{ post }}</a></h2>
|
||||
{% endfor %}
|
||||
</body>
|
||||
</html>
|
||||
```
|
||||
|
||||
- `post.html`:
|
||||
|
||||
``` html
|
||||
<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
<title>{{ title }}</title>
|
||||
</head>
|
||||
<body>
|
||||
<h1>{{ title }}</h1>
|
||||
<div>{{ content|safe }}</div>
|
||||
</body>
|
||||
</html>
|
||||
```
|
||||
|
||||
Modify the home route to display a list of blog post titles:
|
||||
|
||||
``` python
|
||||
@app.route('/')
|
||||
def home():
|
||||
posts = []
|
||||
for file in os.listdir('posts'):
|
||||
if file.endswith('.md'):
|
||||
title = file[:-3]
|
||||
posts.append(title)
|
||||
return render_template('index.html', posts=posts)
|
||||
```
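One small addition not shown above: for `python main.py` to actually start the development server, `main.py` needs an entry point at the bottom of the file. A minimal version (assuming the default port is fine) looks like this:

``` python
if __name__ == '__main__':
    # Start Flask's built-in development server (not suitable for production)
    app.run(debug=True)
```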
|
||||
|
||||
## Adding markdown posts
|
||||
|
||||
Now, before running the app, let's add a few posts.
|
||||
Create a directory called `posts` and add some Markdown files with blog post content.
|
||||
Let's add a `hello.md`:
|
||||
|
||||
``` markdown
|
||||
# Hello
|
||||
|
||||
This is my first blog post
|
||||
### Heading level 3
|
||||
#### Heading level 4
|
||||
##### Heading level 5
|
||||
###### Heading level 6
|
||||
|
||||
I just love **bold text**.
|
||||
|
||||
```
|
||||
|
||||
Now let's run the app by typing the following command:
|
||||
|
||||
``` bash
|
||||
python main.py
|
||||
```
|
||||
|
||||
And you should see the following output in the terminal:
|
||||
|
||||
``` bash
|
||||
 * Serving Flask app 'main'
 * Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
|
||||
* Running on http://127.0.0.1:5000
|
||||
Press CTRL+C to quit
|
||||
* Restarting with stat
|
||||
* Debugger is active!
|
||||
```
|
||||
|
||||
Here is how it would look. I have 2 blog posts with some gifs in them. Navigate to `127.0.0.1:5000` in a browser window:
|
||||
|
||||

|
||||
|
||||
If we click on the `hello` blog post:
|
||||
|
||||

|
@ -1,5 +1,13 @@
|
||||
# Day 49: AWS Cloud Overview
|
||||
|
||||
Welcome to the AWS section of the 90 Days of DevOps! Picking 7 items to learn about is difficult for several reasons:
|
||||
1. At last count, there were 250+ AWS services
|
||||
2. Each service could get its own multi-day deep dive 😅
|
||||
|
||||
Because of that, we're going to do a gentle intro that starts off easy, goes into some very DevOps-salient services, then ends with a section-capstone project that will give you a lot of exposure to AWS DevOps services.
|
||||
|
||||
I hope you enjoy the next 7 days as much as I did creating them. If you have any questions feel free to ask!
|
||||
|
||||
AWS Cloud is a cloud computing platform provided by Amazon Web Services (AWS). It offers a wide range of services, including computing, storage, networking, database, analytics, machine learning, security, and more. AWS Cloud allows businesses and organizations to access these services on a pay-as-you-go basis, which means they only pay for what they use and can scale their resources up or down as needed.
|
||||
|
||||

|
||||
|
@ -0,0 +1,56 @@
|
||||
# Day 52: Identity and Access Management (IAM)
|
||||
|
||||
As cloud computing continues to gain popularity, more and more organizations are turning to cloud platforms to manage their infrastructure. However, with this comes the need to ensure proper security measures are in place to protect data and resources. One of the most critical tools for managing security in AWS is Identity and Access Management (IAM).
|
||||
|
||||
## What is AWS IAM?
|
||||
||
|
||||
|:-:|
|
||||
| <i>IAM is (1) WHO (2) CAN ACCESS (3) WHAT</i>|
|
||||
|
||||
|
||||
AWS IAM is a web service that allows you to manage users and their access to AWS resources. With IAM, you can create and manage AWS users and groups, control access to AWS resources, and set permissions that determine what actions users can perform on those resources. IAM provides fine-grained access control, which means that you can grant or deny permissions to specific resources at a granular level.
|
||||
|
||||
IAM is an essential tool for securing your AWS resources. Without it, anyone with access to your AWS account would have unrestricted access to all your resources. With IAM, you can control who has access to your resources, what actions they can perform, and what resources they can access. IAM also enables you to create and manage multiple AWS accounts, which is essential as large organizations will always have many accounts that will need some level of interaction with each other:
|
||||
|
||||
||
|
||||
|:-:|
|
||||
| <i>Multi-Account IAM access is essential knowledge</i>|
|
||||
|
||||
|
||||
## How to Get Started with AWS IAM
|
||||
|
||||
Getting started with AWS IAM is straightforward. Here are the steps you need to follow:
|
||||
|
||||
### Step 1: Create an AWS Account
|
||||
|
||||
The first step is to create an AWS account if you don't already have one. We did this on day 50 so you should be good to go 😉
|
||||
|
||||
### Step 2: Set up IAM
|
||||
|
||||
Once you have an AWS account, you can set up IAM by navigating to the IAM console. The console is where you'll manage IAM users, groups, roles, and policies.
|
||||
|
||||
### Step 3: Create an IAM User
|
||||
|
||||
The next step is to create an IAM user. An IAM user is an entity that you create in IAM that represents a person or service that needs access to your AWS resources. When you create an IAM user, you can specify the permissions that the user should have. One of the homework assignments from Day 50 was to [Create an IAM user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html), if you haven't completed that go back and make one now.
|
||||
|
||||
### Step 4: Create an IAM Group
|
||||
|
||||
After you've created an IAM user, the next step is to create an IAM group. An IAM group is a collection of IAM users. When you create an IAM group, you can specify the permissions that the group should have. Watch "IAM Basics" and read "IAM User Guide:Getting Started" in the resources section to accomplish this.
|
||||
|
||||
### Step 5: Assign Permissions to the IAM Group
|
||||
|
||||
Once you've created an IAM group, you can assign permissions to the group. This involves creating an IAM policy that defines the permissions that the group should have. You can then attach the policy to the group. Watch "IAM Tutorial & Deep Dive" and go through the IAM Tutorial in the resources section to accomplish this.
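If you prefer the command line to the console, the same flow can be sketched with the AWS CLI; the group name, user name, and managed policy below are just placeholders for illustration:

```bash
# Create a group and add an existing IAM user to it (placeholder names)
aws iam create-group --group-name devops-learners
aws iam add-user-to-group --group-name devops-learners --user-name my-iam-user

# Attach an AWS managed policy to the group (ReadOnlyAccess used as an example)
aws iam attach-group-policy \
  --group-name devops-learners \
  --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
```

`aws sts get-caller-identity` is a handy way to confirm which identity your CLI calls are actually using when you test in Step 6.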
|
||||
|
||||
### Step 6: Test the IAM User
|
||||
|
||||
After you've assigned permissions to the IAM group, you can test the IAM user to ensure that they have the correct permissions. To do this, you can log in to the AWS Management Console using the IAM user's credentials and attempt to perform the actions that the user should be able to perform.
|
||||
|
||||
## Resources:
|
||||
[IAM Basics](https://youtu.be/iF9fs8Rw4Uo)
|
||||
|
||||
[IAM User Guide: Getting started](https://docs.aws.amazon.com/IAM/latest/UserGuide/getting-started.html)
|
||||
|
||||
[IAM Video Tutorial & Deep Dive](https://youtu.be/ExjW3HCFG1U)
|
||||
|
||||
[IAM Tutorial: Delegate access across AWS accounts using IAM roles](https://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_cross-account-with-roles.html)
|
||||
|
@ -0,0 +1,49 @@
|
||||
# Day 53: AWS Systems Manager
|
||||
|
||||

|
||||
|
||||
AWS Systems Manager is a fully managed service that allows users to manage and automate operational tasks both on their AWS and on-premises resources. It provides a centralized platform for managing AWS resources, virtual machines, and applications. It enables DevOps professionals to automate operational tasks, maintain compliance, and reduce operational costs.
|
||||
|
||||
With AWS Systems Manager, users can perform tasks such as automating patch management, automating OS and application deployments, creating and managing Amazon Machine Images (AMIs), and monitoring resource utilization. It also provides a set of tools for configuring and managing instances, which includes run commands, state manager, inventory, and maintenance windows.
|
||||
|
||||
Furthermore, AWS Systems Manager provides a unified view of operational data, allowing users to visualize and monitor operational data across their AWS infrastructure, including EC2 instances, on-premises servers, and AWS services. This allows users to identify and resolve issues faster, improving operational efficiency and reducing downtime.
|
||||
|
||||
## How to Get Started with AWS System Manager?
|
||||
|
||||
Getting started with AWS System Manager is as easy as 1, 2, 3, 4 😄:
|
||||
|
||||

|
||||
|
||||
### Step 1: Navigate to the AWS System Manager Console
|
||||
|
||||
Once you have an AWS account, create 2 Windows servers and 2 Linux servers (free tier of course 😉) and navigate to the AWS Systems Manager console. The console provides a unified interface for managing AWS resources, including EC2 instances, on-premises servers, and other resources:
|
||||
|
||||

|
||||
Click the "get started" button and choose your preferred region (I picked us-east-1)
|
||||
|
||||
### Step 2: Choose a configuration type
|
||||
|
||||
The next step is to configure AWS Systems Manager to manage your resources. You can do this by selecting one of the quick setup common tasks (or create a custom setup type of your own choosing):
|
||||

|
||||
For my needs I'm going to choose "Patch Manager" - in the resources below we will have additional scenarios that you can experiment with. Watch "Patch and manage your AWS Instances in MINUTES with AWS Systems Manager" to see this step in action.
|
||||
|
||||
### Step 3: Specify configuration options
|
||||
|
||||
Each configuration type has a unique set of parameters to apply for this step...
|
||||
||
|
||||
|:-:|
|
||||
| <i>You will see something different depending on which quick start config you chose</i>|
|
||||
|
||||
so I won't be getting into the required arguments for each one. Generally speaking the next step is to create a resource group to organize your resources. Resource groups are collections of resources that share common attributes. By grouping resources, you can view them collectively and apply policies and actions to them together. Watch "Patch and manage your AWS Instances in MINUTES with AWS Systems Manager" to see this step in action.
|
||||
|
||||
### Step 4: Deploy, Review, and Manage Your Resources
|
||||
|
||||
Once you have created a resource group, you can view and manage your resources from the AWS System Manager console. You can also create automation workflows, run patch management, and perform other operations on your resources.
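If you also want to poke at this from the AWS CLI, here is a rough sketch; the tag value is an assumption and the patching document mirrors the Patch Manager quick setup above:

```bash
# List the instances that have registered with Systems Manager
aws ssm describe-instance-information

# Kick off a patch compliance scan against instances in an assumed patch group
aws ssm send-command \
  --document-name "AWS-RunPatchBaseline" \
  --targets "Key=tag:Patch Group,Values=demo" \
  --parameters "Operation=Scan"
```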
|
||||
|
||||
## Resources:
|
||||
[AWS Systems Manager Introduction](https://youtu.be/pSVK-ingvfc)
|
||||
|
||||
[Patch and manage your AWS Instances in MINUTES with AWS Systems Manager](https://youtu.be/DEQFJba3h4M)
|
||||
|
||||
[Getting started with AWS System Manager](https://docs.aws.amazon.com/systems-manager/latest/userguide/getting-started-launch-managed-instance.html)
|
||||
|
@ -0,0 +1,32 @@
|
||||
# Day 54: AWS CodeCommit
|
||||
|
||||

|
||||
|
||||
|
||||
AWS CodeCommit is a fully managed source control service provided by Amazon Web Services (AWS) that makes it easy for developers to host and manage private Git repositories. Think "GitHub but with less features" 🤣 (j/k, see the resource "CodeCommit vs GitHub" for a breakdown) It allows teams to collaborate on code and keep their code securely stored in the cloud, with support for secure access control, encryption, and automatic backups.
|
||||
|
||||
With AWS CodeCommit, developers can easily create, manage, and collaborate on Git repositories with powerful code review and workflow tools. It integrates seamlessly with other AWS services like AWS CodePipeline and AWS CodeBuild, making it easier to build and deploy applications in a fully automated manner.
|
||||
|
||||
Some key features of AWS CodeCommit include:
|
||||
|
||||
- Git-based repositories with support for code reviews and pull requests
|
||||
- Integration with AWS Identity and Access Management (IAM) for secure access control (this is a big plus)
|
||||
- Encryption of data at rest and in transit
|
||||
- Highly scalable and available, with automatic backups and failover capabilities
|
||||
- Integration with other AWS developer tools like AWS CodePipeline and AWS CodeBuild
|
||||
|
||||
In order to effectively leverage CodeCommit, you of course need to know how to use Git. There are [many](https://www.youtube.com/playlist?list=PL2rC-8e38bUXloBOYChAl0EcbbuVjbE3t) [excellent](https://youtu.be/tRZGeaHPoaw) [Git](https://youtu.be/USjZcfj8yxE) [tutorials](https://youtu.be/RGOj5yH7evk) out there, (and that's not my section anyway 😉) so I won't go into that myself.
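As a small, hedged sketch of day-to-day usage (the repository name, description, and region are placeholders), creating and cloning a repository from the CLI looks roughly like this:

```bash
# Create a new CodeCommit repository
aws codecommit create-repository \
  --repository-name my-demo-repo \
  --repository-description "Scratch repo for trying out CodeCommit"

# Clone it over HTTPS (requires Git credentials or the IAM credential helper configured)
git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/my-demo-repo
```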
|
||||
|
||||
Overall, AWS CodeCommit is a powerful tool for teams that need to collaborate on code, manage their repositories securely, and streamline their development workflows.
|
||||
|
||||
|
||||
|
||||
## Resources:
|
||||
|
||||
[AWS CodeCommit User Guide](https://docs.aws.amazon.com/codecommit/latest/userguide/welcome.html)
|
||||
|
||||
[AWS CodeCommit Overview](https://youtu.be/5kFmfgFYOx4)
|
||||
|
||||
[AWS CodeCommit tutorial: your first Repo, Commit and Push](https://youtu.be/t7M8pHCh5Xs)
|
||||
|
||||
[AWS CodeCommit vs GitHub: Which will Shine in 2023?](https://appwrk.com/aws-codecommit-vs-github)
|
@ -0,0 +1,62 @@
|
||||
# Day 55: AWS CodePipeline
|
||||
|
||||
<i>On this last day of AWS services we are going to talk about a big one that has a lot of moving parts and integrations. There are a few free resources out there that will help in your learning/understanding of this... but honestly some of the best ones will cost you some money. I will list them out separately in the resources section and call them out, but I would be remiss in NOT mentioning them as they are fantastic for learning this complex service</i>
|
||||
|
||||
<b>CodePipeline</b> is a fully managed continuous delivery service that allows you to automate your IaC or software release processes. It enables you to create pipelines that build, test, and deploy your code changes continuously and (with proper testing in place) reliably:
|
||||
|
||||

|
||||
|
||||
With CodePipeline, you can create pipelines that automate your build, test, and deployment workflows, ensuring that your code changes are reliably deployed to your target environments. It enables you to achieve faster release cycles, improve collaboration among development and operations teams, and improve the overall quality and reliability of your software releases.
|
||||
|
||||
AWS CodePipeline integrates with other AWS services:
|
||||
- [Source Action Integrations](https://docs.aws.amazon.com/codepipeline/latest/userguide/integrations-action-type.html#integrations-source)
|
||||
- [Build Action Integrations](https://docs.aws.amazon.com/codepipeline/latest/userguide/integrations-action-type.html#integrations-build)
|
||||
- [Test Action Integrations](https://docs.aws.amazon.com/codepipeline/latest/userguide/integrations-action-type.html#integrations-test)
|
||||
- [Deploy Action Integrations](https://docs.aws.amazon.com/codepipeline/latest/userguide/integrations-action-type.html#integrations-deploy)
|
||||
- [Approval Action Integrations](https://docs.aws.amazon.com/codepipeline/latest/userguide/integrations-action-type.html#integrations-approval)
|
||||
- [Invoke Action Integrations](https://docs.aws.amazon.com/codepipeline/latest/userguide/integrations-action-type.html#integrations-invoke)
|
||||
|
||||
|
||||
It also integrates with third-party tools such as GitHub, Jenkins, and Bitbucket. You can use AWS CodePipeline to manage your application updates across multiple AWS accounts and regions.
|
||||
|
||||
## Getting started with AWS CodePipeline
|
||||
|
||||
To get started with AWS CodePipeline, there are several excellent [tutorials](https://docs.aws.amazon.com/codepipeline/latest/userguide/tutorials.html) in the [AWS User Guide](https://docs.aws.amazon.com/codepipeline/latest/userguide/welcome.html). They all basically break down into the following 3 steps:
|
||||
|
||||
### Step 1: Create an IAM role
|
||||
|
||||
You need to create an IAM role that allows AWS CodePipeline to access the AWS resources required to run your pipelines. To create an IAM role, review the steps from [Day 52](day52.md)
|
||||
|
||||
### Step 2: Create a CodePipeline pipeline
|
||||
|
||||
To create a CodePipeline pipeline, go to the AWS CodePipeline console, click on the "Create pipeline" button, and then follow the instructions to create your pipeline. You will need to specify the source location for your code, the build provider you want to use, the deployment provider you want to use, and the IAM role you created in step 1.
|
||||
|
||||
### Step 3: Test and deploy your code changes
|
||||
|
||||
Once you have created your CodePipeline pipeline, you can test and deploy your code changes. AWS CodePipeline will automatically build, test, and deploy your code changes to your target environments. You can monitor the progress of your pipeline in the AWS CodePipeline console.
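You can also keep an eye on a pipeline from the CLI instead of the console; for example (the pipeline name is a placeholder):

```bash
# List your pipelines, then check the state of each stage and action
aws codepipeline list-pipelines
aws codepipeline get-pipeline-state --name my-demo-pipeline
```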
|
||||
|
||||
## Capstone Project
|
||||
To tie up this AWS section of the 90 Days of DevOps, I recommend that you go through Adrian Cantrill's excellent mini-project, the [CatPipeline](https://www.youtube.com/playlist?list=PLTk5ZYSbd9MgARTJHbAaRcGSn7EMfxRHm). In it you will be exposed to CodeCommit, CodeBuild, CodeDeploy, and CodePipeline in a fun little project that will give you a taste of a day in the life of a DevOps engineer.
|
||||
- [YouTube CatPipeline Playlist](https://www.youtube.com/playlist?list=PLTk5ZYSbd9MgARTJHbAaRcGSn7EMfxRHm)
|
||||
- [GitHub CatPipeline Repo](https://github.com/acantril/learn-cantrill-io-labs/tree/master/aws-codepipeline-catpipeline)
|
||||
|
||||
|
||||
## Resources (Free):
|
||||
|
||||
[AWS: Real-world CodePipeline CI/CD Examples](https://youtu.be/MNt2HGxClZ0)
|
||||
|
||||
[AWS CodePipeline User Guide](https://docs.aws.amazon.com/codepipeline/latest/userguide/welcome.html)
|
||||
|
||||
[AWS CodePipeline Tutorials](https://docs.aws.amazon.com/codepipeline/latest/userguide/tutorials.html)
|
||||
|
||||
[AWS CodeCommit tutorial: your first Repo, Commit and Push](https://youtu.be/t7M8pHCh5Xs)
|
||||
|
||||
[AWS CodeCommit vs GitHub: Which will Shine in 2023?](https://appwrk.com/aws-codecommit-vs-github)
|
||||
|
||||
## Resources (Paid):
|
||||
There are a number of <i>excellent</i> instructors out there and picking 2-3 is always hard, but [Adrian Cantrill](https://learn.cantrill.io/), [Andrew Brown](https://www.exampro.co/), and [Stephane Maarek](https://www.udemy.com/user/stephane-maarek/) always come to mind when discussing fantastic content out there.
|
||||
|
||||
## Final Thoughts
|
||||
I hope that this section of the 90 Days of DevOps has given you a taste of what is available in the AWS ecosystem.
|
||||
|
||||
Good luck in your studies! Up next is Red Hat OpenShift!
|
111
2023/day56.md
@ -0,0 +1,111 @@
|
||||
# Why choose an enterprise Kubernetes platform?
|
||||
|
||||
In the original #90DaysOfDevOps, [Michael covered Kubernetes](https://github.com/MichaelCade/90DaysOfDevOps/blob/main/2022.md#kubernetes) from what it is, to a basic installation, and looking around the various components that make up the ability to deploy and consume containers.
|
||||
|
||||
When you look at Kubernetes offerings, there is the open-source, do-it-yourself route, typically known as vanilla Kubernetes, and there are enterprise offerings. An enterprise Kubernetes platform provides a more comprehensive set of features and services than vanilla Kubernetes. These may include advanced security, auditing, and logging capabilities, as well as simplified deployment, configuration, and management. In addition, enterprise Kubernetes platforms provide support for a variety of cloud providers and often come with additional features such as multi-cluster management, cloud-native storage and networking, and automated compliance checks. These features and services make it easier for organizations to manage their Kubernetes clusters and ensure that they are secure, compliant, and optimized for performance.
|
||||
|
||||
Some of the leading vendors offering Enterprise Kubernetes include Red Hat OpenShift, Google Anthos, Amazon EKS, Microsoft Azure Kubernetes Service, and VMware Tanzu. There are of course a lot more to add to this list!
|
||||
|
||||
Of course for this section, we are dedicating ourselves to Red Hat OpenShift.
|
||||
|
||||
I decided to cheat a bit for this opening and consult ChatGPT, the AI service of the moment, for an opinion on why to choose Red Hat OpenShift.
|
||||
|
||||
```
|
||||
Why choose an enterprise Kubernetes platform such as Red Hat OpenShift?
|
||||
|
||||
|
||||
1. Automation and Enterprise-grade Security: Red Hat OpenShift provides an automated platform to help you deploy, manage, and scale your applications quickly and easily while ensuring that the underlying infrastructure is secure.
|
||||
|
||||
2. Open Source: Red Hat OpenShift is built on top of open source technologies such as Kubernetes, Docker, and Red Hat Enterprise Linux. This ensures that your applications are always up-to-date with the latest technologies.
|
||||
|
||||
3. Continuous Integration and Delivery: Red Hat OpenShift provides an integrated set of tools for continuous integration and delivery that make it easy to deploy and manage your applications in a secure and reliable manner.
|
||||
|
||||
4. Scalability and High Availability: Red Hat OpenShift provides a highly available and scalable platform that can handle large amounts of traffic and data without any downtime.
|
||||
|
||||
5. Cost Savings: Red Hat OpenShift provides a cost-effective solution for running your applications in the cloud. You can save money on infrastructure and operations costs by leveraging OpenShift's automated platform.
|
||||
```
|
||||
|
||||
# What does Red Hat OpenShift bring to the party?
|
||||
|
||||
Red Hat has developed OpenShift on top of an open-source platform (Kubernetes) and also distributes the OpenShift platform enhancements as open source, in the form of the [OpenShift Kubernetes Distribution (OKD)](https://www.okd.io).
|
||||
|
||||
This distribution of Kubernetes is designed to provide continuous application development and multi-tenant capabilities, with embedded security tooling, lifecycle management, and operational management.
|
||||
|
||||
## Out-of-the-Box Features
|
||||
|
||||
Again, this is not an exhaustive list:
|
||||
|
||||
- Pre-built container images and a container runtime environment.
|
||||
- Integrated, open source platform and container runtime environment.
|
||||
- Access to a wide range of services such as databases, messaging, and storage.
|
||||
- These services are provided by [Red Hat Ecosystem Catalog](https://catalog.redhat.com/), which allows third parties to ensure their software is certified for the platform, and provides users an easy way to consume the software within the platform.
|
||||
- Platform for deploying custom applications.
|
||||
- Web-based user interface, command line tools, and an API.
|
||||
- Monitoring and logging capabilities.
|
||||
- Security and resource isolation.
|
||||
- Automated build and deployment pipelines.
|
||||
- Continuous integration and continuous delivery (CI/CD) capabilities.
|
||||
|
||||
You can read more in-depth coverage of the benefits and features of Red Hat OpenShift in [this datasheet](https://www.redhat.com/en/resources/openshift-container-platform-datasheet), or find a full breakdown on the [Red Hat Developers page](https://developers.redhat.com/products/openshift/overview).
|
||||
|
||||

|
||||
|
||||
## Where can I deploy OpenShift?
|
||||
|
||||
As mentioned earlier, OpenShift can be deployed across most platforms you can think of: within your own datacenter, either on bare metal or to a hypervisor, or out into the cloud offerings, either self-managed or managed by Red Hat. Below are the platforms listed as supported at the time of writing this article.
|
||||
|
||||
- Cloud Services Editions - Managed by Red Hat
|
||||
- [AWS (ROSA)](https://www.redhat.com/en/technologies/cloud-computing/openshift/aws)
|
||||
- Billed by AWS
|
||||
- [Azure](https://www.redhat.com/en/technologies/cloud-computing/openshift/azure)
|
||||
- Billed by Microsoft
|
||||
- [IBM Cloud](https://www.redhat.com/en/technologies/cloud-computing/openshift/ibm)
|
||||
- Billed by IBM
|
||||
- [Red Hat OpenShift Dedicated](https://www.redhat.com/en/resources/openshift-dedicated-datasheet) - Hosted in the cloud, dedicated to a single customer.
|
||||
- Deployed to either AWS or GCP
|
||||
- Billed by Red Hat for the OpenShift Software, billed by AWS/GCP for the cloud infrastructure used
|
||||
|
||||
How is Red Hat OpenShift Service on AWS/Azure different to Red Hat OpenShift Dedicated?
|
||||
|
||||
```
|
||||
|
||||
Red Hat OpenShift Service on AWS is a fully managed implementation of OpenShift Container Platform deployed and operated on AWS, jointly managed and supported by both Red Hat and AWS.
|
||||
|
||||
Red Hat OpenShift Dedicated is a service hosted and fully-managed by Red Hat that offers clusters in a virtual private cloud on AWS or Google Cloud Platform.
|
||||
```
|
||||
|
||||
- Self-Managed Editions - Managed by you
|
||||
- Amazon Web Services (AWS)
|
||||
- Google Cloud Platform (GCP)
|
||||
- Microsoft Azure
|
||||
- Microsoft Azure Stack Hub
|
||||
- Red Hat OpenStack Platform (RHOSP) versions 16.1 and 16.2
|
||||
- IBM Cloud VPC
|
||||
- Nutanix
|
||||
- Red Hat Virtualization (RHV)
|
||||
- VMware vSphere
|
||||
- VMware Cloud (VMC) on AWS
|
||||
- Alibaba Cloud
|
||||
- Bare metal
|
||||
- IBM Z or LinuxONE
|
||||
- IBM Power
|
||||
|
||||
## Getting access to a trial
|
||||
|
||||
Getting started with OpenShift is simple. Red Hat gives you the ability to trial three options:
|
||||
- Developer Sandbox - A hosted instance of OpenShift for you to consume straight away for 30 days
|
||||
- Managed Service - A fully managed Red Hat OpenShift Dedicated instance for you to consume; you will need to provide the AWS or GCP cloud account to deploy it into. 60-day trial.
|
||||
- Self-Managed - Deploy OpenShift yourself to any of the platforms named above. 60-day trial.
|
||||
|
||||
You'll need to sign up for a Red Hat account to access the trial and get the software details to deploy.
|
||||
- [Try Red Hat OpenShift](https://www.redhat.com/en/technologies/cloud-computing/openshift/try-it)
|
||||
|
||||
# Next Steps - Understanding the OpenShift Architecture + Spinning up an instance!
|
||||
|
||||
In [day 57](/day57.md) we will dive into the architecture and components of OpenShift, moving on to spinning up our own OpenShift environment in [day 58](/day58.md).
|
||||
|
||||
# Resources
|
||||
|
||||
- [OKD](https://www.okd.io/)
|
||||
- [Official Red Hat OpenShift product page](https://www.redhat.com/en/technologies/cloud-computing/openshift)
|
||||
- [Red Hat Hybrid Cloud Learning Hub](https://cloud.redhat.com/learn)
|
134
2023/day63.md
@ -0,0 +1,134 @@
|
||||
# An introduction to databases
|
||||
|
||||
Welcome to the 90DaysOfDevOps database series. Over the next seven days we’ll be talking about all things database related!
|
||||
|
||||
The aim of this series of blog posts is to provide an introduction to databases and their various concepts so that you will be able to make an informed choice when deciding how to store data in your future projects.
|
||||
|
||||
Here’s what we’ll be covering: -
|
||||
|
||||
- An introduction to databases
|
||||
- Querying data in databases
|
||||
- Backing up databases
|
||||
- High availability and disaster recovery
|
||||
- Performance tuning
|
||||
- Database security
|
||||
- Monitoring and troubleshooting database issues
|
||||
|
||||
We’ll also be providing examples to accompany the concepts discussed. In order to do so you will need Docker Desktop installed. Docker can be downloaded here (https://www.docker.com/products/docker-desktop/) and is completely free.
|
||||
|
||||
Alternatives to Docker Desktop can be used (such as Rancher Desktop or Finch) but the examples will focus on Docker.
|
||||
|
||||
We'll be using a custom PostgreSQL image in the examples and connecting with pgAdmin: -
|
||||
https://www.pgadmin.org/
|
||||
|
||||
<br>
|
||||
|
||||
# About Us
|
||||
|
||||
<b>Andrew Pruski</b><br>
|
||||
Andrew is a Field Solutions Architect working for Pure Storage. He is a Microsoft Data Platform MVP, Certified Kubernetes Administrator, and Raspberry Pi tinkerer. You can find him on Twitter @dbafromthecold, on LinkedIn, and blogging at dbafromthecold.com.
|
||||
|
||||
<b>Taylor Riggan</b><br>
|
||||
Taylor is a Sr. Graph Architect on the Amazon Neptune development team at Amazon Web Services. He works with customers of all sizes to help them learn and use purpose-built NoSQL databases via the creation of reference architectures, sample solutions, and hands-on workshops. You can find him on Twitter @triggan and LinkedIn.
|
||||
|
||||
<br>
|
||||
|
||||
# Why databases?
|
||||
|
||||
The total amount of data created worldwide is predicted to reach 181 zettabytes by 2025.
|
||||
|
||||
That’s 181 billion terabytes!
|
||||
|
||||

|
||||
|
||||
source - https://www.statista.com/statistics/871513/worldwide-data-created/
|
||||
|
||||
|
||||
Imagine if all that data was stored in flat files, for example Excel sheets! OK, storing that data might not be such an issue: just save the file on a networked drive and all good! But what about when it comes to retrieving that data? What about updating a single record amongst hundreds, thousands, or millions of files?
|
||||
|
||||
This is where database technologies come into play. Databases give us the ability to not only store data but to easily retrieve, update, and delete individual records.
|
||||
|
||||
<br>
|
||||
|
||||
# Relational databases
|
||||
|
||||
When it comes to databases, there are two main types...relational and non-relational (or NoSQL) databases.
|
||||
|
||||
SQL Server, Oracle, MySQL, and PostgreSQL are all types of relational databases.
|
||||
|
||||
Relational databases were first described by Edgar Codd in 1970, whilst he was working at IBM, in a research paper, “A Relational Model of Data for Large Shared Data Banks”.
|
||||
|
||||
This paper led the way for the rise of the various different relational databases that we have today.
|
||||
|
||||
In a relational database, data is organised into tables (containing rows and columns) and these tables have “relationships” with each other.
|
||||
|
||||
For example, a Person table may have an addressID column which points to a row within an Address table. This allows an end user or application to easily retrieve a record from the Person table along with the related record from the Address table.
|
||||
|
||||
The addressID column is a unique “key” in the Address table but is present in the Person table as a “foreign key”.
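As a rough sketch of that Person/Address example (the table and column names here are purely illustrative and not from the demo database used later in this series), the two tables could be defined like this: -

CREATE TABLE Address (
    addressID INT PRIMARY KEY,    -- unique key for each address
    street VARCHAR(100),
    city VARCHAR(50)
);

CREATE TABLE Person (
    personID INT PRIMARY KEY,
    name VARCHAR(50),
    addressID INT REFERENCES Address (addressID)    -- foreign key back to the Address table
);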
|
||||
|
||||
The design of the tables and the relations between them in a relational database is said to be the database schema. The process of building this schema is called database normalisation.
|
||||
|
||||
Data is selected, updated, or deleted from a relational database via a programming language called SQL (Structured Query Language).
|
||||
|
||||
To support retrieving data from tables in a relational database, there is the concept of “indexes”. Indexes provide a way for queries to quickly locate one row or a subset of rows, without having to scan all the rows in the table.
|
||||
|
||||
The analogy often used when describing indexes is an index of a book. The user (or query) uses the index to go directly to the page (or row) they are looking for, without having to “scan” all the way through the book from the start.
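As a small, hedged example (assuming a staff table like the one in the demo database used later in this series; the name searched for is just an illustrative value), an index on the last_name column lets the query below find matching rows without scanning the whole table: -

CREATE INDEX idx_staff_last_name ON staff (last_name);    -- build the "book index" on last_name

SELECT first_name, last_name
FROM staff
WHERE last_name = 'Smith';    -- illustrative value; the index is used to jump straight to matching rows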
|
||||
|
||||
Queries accessing databases can also be referred to as transactions…a logical unit of work that accesses and/or modifies the data. In order to maintain consistency in the database, transactions must have certain properties. These properties are referred to as ACID properties: -
|
||||
|
||||
A - Atomicity - all of the transaction completes or none of it does<br>
|
||||
C - Consistency - the data modified must not violate the integrity of the database<br>
|
||||
I - Isolation - multiple transactions take place independently of one another<br>
|
||||
D - Durability - Once a transaction has completed, it will remain in the system, even in the event of a system failure.
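As a minimal sketch of atomicity in SQL (assuming a staff table with rows 1 and 2, like the demo database used later in this series): -

BEGIN;                                                 -- start the transaction
UPDATE staff SET active = false WHERE staff_id = 1;
UPDATE staff SET active = true  WHERE staff_id = 2;
-- if anything fails before the COMMIT, the whole transaction is rolled back and neither update is applied
COMMIT;                                                -- both changes become permanent together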
|
||||
|
||||
We will go through querying relational databases in the next blog post.
|
||||
|
||||
<br>
|
||||
|
||||
# Non-Relational databases
|
||||
|
||||
The downside of relational databases is that the data ingested has to "fit" the structure of the database schema. But what if we're dealing with large amounts of data that doesn't match that structure?
|
||||
|
||||
This is where non-relational databases come into play. These types of databases are referred to as NoSQL (non-SQL or Not Only SQL) databases and are either schema-free or have a schema that allows for changes in the structure.
|
||||
|
||||
Apache Cassandra, MongoDB, and Redis are all types of NoSQL databases.
|
||||
|
||||
Non-relational databases have existed since the 1960s, but the term “NoSQL” was first used in 1998 by Carlo Strozzi when naming his Strozzi NoSQL database (which was, however, still a relational database). It wasn’t until 2009 that Johan Oskarsson reintroduced the term, when he organised an event to discuss “open-source distributed, non-relational databases”.
|
||||
There are various different types of NoSQL databases, all of which store and retrieve data differently.
|
||||
|
||||
For example: -
|
||||
|
||||
Apache Cassandra is a wide-column store database. It uses tables, rows, and columns like a relational database, but the names and formats of the columns can vary from row to row in the same table. It uses the Cassandra Query Language (CQL) to access the data stored.
|
||||
|
||||
MongoDB is a document store database. Data is stored as objects (documents) within the database that do not adhere to a defined schema. MongoDB supports a variety of methods to access data, such as range queries and regular expression searches.
|
||||
|
||||
Redis is a distributed in-memory key-value database. Redis supports many different data structures - sets, hashes, lists, etc. - https://redis.com/redis-enterprise/data-structures/
|
||||
Records are identified using a unique key, and Redis provides client libraries for various programming languages to access the data stored.
|
||||
|
||||
NoSQL databases generally do not comply with ACID properties but there are exceptions.
|
||||
|
||||
Each has pros and cons when it comes to storing data, which one to use would be decided on the type of data that is being ingested.
|
||||
|
||||
<br>
|
||||
|
||||
# When to use relational vs non-relational databases
|
||||
|
||||
This is an interesting question and the answer, unfortunately, is it depends.
|
||||
|
||||
It all depends on the type of data being stored, where it is to be stored, and how it is to be accessed.
|
||||
|
||||
If you have data that is highly structured, stored in a central location, and will be accessed by complex queries (such as reports), then a relational database would be the right choice.
|
||||
|
||||
If however, the data is loosely-structured, needs to be available in multiple regions, and will be retrieved with a specific type of query (e.g.- a quick lookup in a key/value store), then a non-relational database would be the right choice.
|
||||
|
||||
There is a massive caveat with the statements above, however…there are types of non-relational databases that can handle large, complex queries, and likewise, relational databases have features that allow for data to be available in multiple regions.
|
||||
|
||||
It also comes down to the skill set of the people involved. For example, Andrew is a former SQL Server DBA…so we know what his default choice would be when choosing a type of database!
|
||||
|
||||
In contrast, Taylor works on the development team for one of the more popular cloud-hosted graph databases, so he is more likely to start with a NoSQL data store.
|
||||
|
||||
The great thing about databases is that there are so many options to choose from within the realm of commercial offerings, cloud services, and the open-source ecosystem. The amount of choice, however, can be daunting for someone new to this space.
|
||||
|
||||
Join us tomorrow when we'll be talking about querying databases.
|
||||
|
||||
Thanks for reading!
|
290
2023/day64.md
@ -0,0 +1,290 @@
|
||||
# Querying data in databases
|
||||
|
||||
Hello and welcome to the second post in the database part of the 90 Days of DevOps blog series!
|
||||
|
||||
In this post we will be going through spinning up an instance of PostgreSQL in a docker container, retrieving data, and then updating that data.
|
||||
|
||||
So let’s get started!
|
||||
|
||||
<br>
|
||||
|
||||
# Software needed
|
||||
|
||||
To follow along with the scripts in this blog post, you will need Docker and pgAdmin installed.
|
||||
|
||||
Both are completely free and can be downloaded here: -
|
||||
|
||||
Docker - https://www.docker.com/products/docker-desktop/ <br>
|
||||
pgAdmin - https://www.pgadmin.org/
|
||||
|
||||
<br>
|
||||
|
||||
# Running PostgreSQL
|
||||
|
||||
We have created a custom PostgreSQL docker image which has a demo database ready to go.
|
||||
|
||||
In order to run the container, open a terminal and execute: -
|
||||
|
||||
docker run -d \
|
||||
--publish 5432:5432 \
|
||||
--env POSTGRES_PASSWORD=Testing1122 \
|
||||
--name demo-container \
|
||||
ghcr.io/dbafromthecold/demo-postgres:latest
|
||||
|
||||
This will pull the image down from our GitHub repository and spin up an instance of PostgreSQL with a database, dvdrental, ready to go.
|
||||
|
||||
Note - the image size is 437MB which may or may not be an issue depending on your internet connection
|
||||
|
||||
Confirm the container is up and running with: -
|
||||
|
||||
docker container ls
|
||||
|
||||
Then open pgAdmin and connect with the server name as *localhost* and the password as *Testing1122*
|
||||
|
||||
<br>
|
||||
|
||||
# Selecting data
|
||||
|
||||
Once you’ve connected to the instance of PostgreSQL running in the container, let’s look at the staff table in the dvdrental database. Right-click on the dvdrental database in the left-hand menu and select Query Tool.
|
||||
|
||||
To retrieve data from a table we use a SELECT statement. The structure of a SELECT statement is this: -
|
||||
|
||||
|
||||
SELECT data_we_want_to_retrieve
|
||||
FROM table
|
||||
WHERE some_condition
|
||||
|
||||
|
||||
So to retrieve all the data from the staff table, we would run: -
|
||||
|
||||
SELECT *
|
||||
FROM staff
|
||||
|
||||
The * indicates we want to retrieve all the columns from the table.
|
||||
|
||||
If we wanted to only retrieve staff members called “Mike” we would run: -
|
||||
|
||||
SELECT *
|
||||
FROM staff
|
||||
WHERE first_name = 'Mike'
|
||||
|
||||
OK, now let’s look at joining two tables together in the SELECT statement.
|
||||
|
||||
Here is the relationship between the staff and address tables: -
|
||||
|
||||

|
||||
|
||||
From the Entity Relationship Diagram (ERD), which is a method of displaying the tables in a relational database and their relationships, we can see that the tables are joined on the address_id column.
|
||||
|
||||
The address_id column is a primary key in the address table and a foreign key in the staff table.
|
||||
|
||||
We can also see (by looking at the join) that this is a many-to-one relationship…aka rows in the address table can be linked to more than one row in the staff table.
|
||||
|
||||
Makes sense as more than one member of staff could have the same address.
|
||||
|
||||
Ok, in order to retrieve data from both the staff and address tables we join them in our SELECT statement: -
|
||||
|
||||
SELECT *
|
||||
FROM staff s
|
||||
INNER JOIN address a ON s.address_id = a.address_id
|
||||
|
||||
That will retrieve all the rows from the staff table and also all the corresponding rows from the address table…aka we have retrieved all staff members and their addresses.
|
||||
|
||||
Let’s limit the query a little. Let’s just retrieve some data from the staff table and some from the address table for one staff member
|
||||
|
||||
SELECT s.first_name, s.last_name, a.address, a.district, a.phone
|
||||
FROM staff s
|
||||
INNER JOIN address a ON s.address_id = a.address_id
|
||||
WHERE first_name = 'Mike'
|
||||
|
||||
Here we have only retrieved the name of any staff member called Mike and their address.
|
||||
|
||||
You may have noticed that when joining the address table to the staff table we used an <b>INNER JOIN</b>.
|
||||
|
||||
This is a type of join that retrieves only the rows in the staff table that have a corresponding row in the address table.
|
||||
|
||||
The other types of joins are: -
|
||||
|
||||
<b>LEFT OUTER JOIN</b> - this would retrieve data in the staff table even if there was no row in the address table
|
||||
|
||||
<b>RIGHT OUTER JOIN</b> - this would retrieve data in the address table even if there was no row in the staff table
|
||||
|
||||
<b>FULL OUTER JOIN</b> - this would retrieve all data from the tables even if there was no corresponding matching row in the other table
|
||||
|
||||
If we run: -
|
||||
|
||||
SELECT *
|
||||
FROM staff s
|
||||
RIGHT OUTER JOIN address a ON s.address_id = a.address_id
|
||||
|
||||
We will get all the rows in the address table, including those that do not have a corresponding row in the staff table.
|
||||
|
||||
But, if we run: -
|
||||
|
||||
SELECT *
|
||||
FROM staff s
|
||||
LEFT OUTER JOIN address a ON s.address_id = a.address_id
|
||||
|
||||
We will get all the rows in the staff table, each with its matching row from the address table (because every staff record references an address, the result is the same as the inner join).
|
||||
|
||||
<br>
|
||||
|
||||
# Inserting data
|
||||
|
||||
When inserting data into the staff table, we have to be careful because of the FOREIGN KEY constraint linking the two tables: if we tried to insert a row into the staff table with an address_id that did not exist in the address table, we would get an error: -
|
||||
|
||||
ERROR: insert or update on table "staff" violates foreign key constraint "staff_address_id_fkey"
|
||||
|
||||
This is because the foreign key is saying that a record in the staff table must reference a valid row in the address table.
|
||||
|
||||
Enforcing this relationship is enforcing the referential integrity of the database…i.e. - maintaining consistent and valid relationships between the data in the tables.
|
||||
|
||||
So we have to add a row to the staff table that references an existing row in the address table.
|
||||
|
||||
So an example of this would be: -
|
||||
|
||||
INSERT INTO staff(
|
||||
staff_id, first_name, last_name, address_id,
|
||||
email, store_id, active, username, password, last_update, picture)
|
||||
VALUES
|
||||
(999, 'Andrew', 'Pruski', 1, 'andrew.pruski@90daysofdevops.com',
|
||||
'2', 'T', 'apruski', 'Testing1122', CURRENT_DATE, '');
|
||||
|
||||
Notice that we specify all the columns in the table and then the corresponding values.
|
||||
|
||||
To verify that the row has been inserted: -
|
||||
|
||||
SELECT s.first_name, s.last_name, a.address, a.district, a.phone
|
||||
FROM staff s
|
||||
INNER JOIN address a ON s.address_id = a.address_id
|
||||
WHERE first_name = 'Andrew'
|
||||
|
||||
And there is our inserted row!
|
||||
|
||||
<br>
|
||||
|
||||
# Updating data
|
||||
|
||||
To update a row in a table we use a statement in the format: -
|
||||
|
||||
UPDATE table
|
||||
SET column = new_value
|
||||
WHERE some_condition
|
||||
|
||||
OK, now let’s update the row that we inserted previously. Say the staff member’s email address has changed. To view the current email address: -
|
||||
|
||||
SELECT s.first_name, s.last_name, s.email
|
||||
FROM staff s
|
||||
WHERE first_name = 'Andrew'
|
||||
|
||||
And we want to change that email value to 'apruski@90daysofdevops.com'. To update that value: -
|
||||
|
||||
UPDATE staff
|
||||
SET email = 'apruski@90daysofdevops.com'
|
||||
WHERE first_name = 'Andrew'
|
||||
|
||||
You should see that one row has been updated in the output. To confirm run the SELECT statement again: -
|
||||
|
||||
SELECT s.first_name, s.last_name, s.email
|
||||
FROM staff s
|
||||
WHERE first_name = 'Andrew'
|
||||
|
||||
<br>
|
||||
|
||||
# Deleting data
|
||||
|
||||
To delete a row from a table we use a statement in the format: -
|
||||
|
||||
DELETE FROM table
|
||||
WHERE some_condition
|
||||
|
||||
So to delete the row we inserted and updated previously, we can run: -
|
||||
|
||||
DELETE FROM staff
|
||||
WHERE first_name = 'Andrew'
|
||||
|
||||
You should see that one row was deleted in the output. To confirm: -
|
||||
|
||||
SELECT s.first_name, s.last_name, s.email
|
||||
FROM staff s
|
||||
WHERE first_name = 'Andrew'
|
||||
|
||||
No rows should be returned.
|
||||
|
||||
<br>
|
||||
|
||||
# Creating tables
|
||||
|
||||
Let’s have a look at the definition of the staff table. This can be scripted out by right-clicking on the table, then Scripts > CREATE Script
|
||||
|
||||
This will open a new query window and show the statement to create the table.
|
||||
|
||||
Each column will be listed with the following properties: -
|
||||
|
||||
Name - Data type - Constraints
|
||||
|
||||
If we look at the address_id column we can see: -
|
||||
|
||||
address_id smallint NOT NULL
|
||||
|
||||
So we have the column name, that it is a smallint data type (https://www.postgresql.org/docs/9.1/datatype-numeric.html), and that it cannot be null. It cannot be null as a FOREIGN KEY constraint is going to be created to link to the address table.
|
||||
|
||||
The columns that are a character datatype also have a COLLATE property. Collations specify case, sorting rules, and accent sensitivity properties. Each character datatype here is using the default setting, more information on collations can be found here: -
|
||||
https://www.postgresql.org/docs/current/collation.html
|
||||
|
||||
Other columns also have default values specified…such as the <b>last_update</b> column: -
|
||||
|
||||
last_update timestamp without time zone NOT NULL DEFAULT 'now()'
|
||||
|
||||
This says that if no value is set for the column when a row is inserted, the current time will be used.
|
||||
|
||||
Then we have a couple of constraints defined on the table. Firstly a primary key: -
|
||||
|
||||
CONSTRAINT staff_pkey PRIMARY KEY (staff_id)
|
||||
|
||||
A primary key is a unique identifier in a table, aka this column can be used to identify individual rows in the table. The primary key on the address table, address_id, is used as a foreign key in the staff table to link a staff member to an address.
|
||||
|
||||
The foreign key is also defined in the CREATE TABLE statement: -
|
||||
|
||||
CONSTRAINT staff_address_id_fkey FOREIGN KEY (address_id)
|
||||
REFERENCES public.address (address_id) MATCH SIMPLE
|
||||
ON UPDATE CASCADE
|
||||
ON DELETE RESTRICT
|
||||
|
||||
We can see here that the address_id column in the staff table references the address_id column in the address table.
|
||||
|
||||
The ON UPDATE CASCADE means that if the address_id in the address table is updated, any rows in the staff table referencing it will also be updated. (Note - it’s very rare that you would update a primary key value in a table, I’m including this here as it’s in the CREATE TABLE statement).
|
||||
|
||||
The ON DELETE RESTRICT prevents the deletion of any rows in the address table that are referenced in the staff table. This prevents rows in the staff table having references to the rows in the address table that are no longer there…protecting the integrity of the data.
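As a hedged illustration of ON DELETE RESTRICT in action (the exact error wording may differ slightly between PostgreSQL versions), trying to delete an address row that a staff row still references fails: -

-- attempt to delete an address that is referenced by a staff record
DELETE FROM address
WHERE address_id = 1;

-- ERROR: update or delete on table "address" violates foreign key constraint
-- "staff_address_id_fkey" on table "staff"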
|
||||
|
||||
OK so let’s create our own table and import some data into it: -
|
||||
|
||||
CREATE TABLE test_table (
|
||||
id smallint,
|
||||
first_name VARCHAR(50),
|
||||
last_name VARCHAR(50),
|
||||
dob DATE,
|
||||
email VARCHAR(255),
|
||||
CONSTRAINT test_table_pkey PRIMARY KEY (id)
|
||||
)
|
||||
|
||||
NOTE - VARCHAR is an alias for CHARACTER VARYING which we saw when we scripted out the staff table
|
||||
|
||||
Ok, so we have a test_table with five columns and a primary key (the id column).
|
||||
|
||||
Let’s go and import some data into it. The docker image that we are using has a test_data.csv file in the /dvdrental directory and we can import that data with: -
|
||||
|
||||
COPY test_table(id,first_name, last_name, dob, email)
|
||||
FROM '/dvdrental/test_data.csv'
|
||||
DELIMITER ','
|
||||
CSV HEADER;
|
||||
|
||||
To verify: -
|
||||
|
||||
SELECT * FROM test_table
|
||||
|
||||
So that’s how to retrieve, update, and delete data from a database. We also looked at creating tables and importing data.
|
||||
|
||||
Join us tomorrow where we will be looking at backing up and restoring databases.
|
||||
|
||||
Thank you for reading!
|
253
2023/day65.md
@ -0,0 +1,253 @@
|
||||
# Backing up and restoring databases
|
||||
|
||||
Hello and welcome to the third post in the database part of the 90 Days of DevOps blog series! Today we’ll be talking about backing up and restoring databases.
|
||||
|
||||
One of the (if not the) most vital tasks a Database Administrator performs is backing up databases.
|
||||
|
||||
Things do go wrong with computer systems, not to mention with the people who operate them 🙂, and when they do, we need to be able to recover the data.
|
||||
|
||||
This is where backups come in. There are different types of backups, and different types of databases perform their backups in different ways…but the core concepts are the same.
|
||||
|
||||
The core backup is the full backup. This is a backup of the entire database, everything, at a certain point in time. These backups are the starting point of a recovery process.
|
||||
|
||||
Then there are incremental/differential backups. Certain databases allow for this type of backup to be taken which only includes the data changes since the last full backup. This type of backup is useful when dealing with large databases and taking a full backup is a long process. In the recovery process, the full backup is restored first and then the applicable incremental/differential backup.
|
||||
|
||||
Another important type of backup is backing up the “log”. Databases have a log of transactions that are executed against the data stored. Typically the log is written to before any data changes are made so that in the event of a failure, the log can be “replayed” (aka rolling forward any committed transactions in the log, and rolling back any uncommitted transactions) so that the database comes back online in a consistent state.
|
||||
|
||||
Log backups allow database administrators to achieve point-in-time recovery. By restoring the full backup, then any incremental/differential backups, and then subsequent log backups, the DBA can roll the database forward to a certain point in time (say, just before a data loss event) to recover data.
|
||||
|
||||
Backups are kept separate from the server that hosts the databases being backed up. You don’t want a server to go down and take its database backups with it! Typically backups will be stored in a centralised location and then shipped to a third site (just in case the whole primary site goes down).
|
||||
|
||||
One motto of DBAs is “It’s not a backup process, it’s a recovery process” (or so Andrew says). Meaning that backing up the databases is useless if those backups cannot be restored easily. So DBAs will have a whole host of scripts ready to go if a database needs to be restored to make the process as painless as possible. You really don’t want to be scrabbling around looking for your backups when you need to perform a restore!
|
||||
|
||||
Let’s have a look at backing up and restoring a database in PostgreSQL.
|
||||
|
||||
<br>
|
||||
|
||||
# Setup
|
||||
|
||||
For the demos in this blog post we’ll be using the dvdrental database in the custom PostgreSQL docker image. To spin this image up, start docker, open a terminal, and run: -
|
||||
|
||||
docker run -d \
|
||||
--publish 5432:5432 \
|
||||
--env POSTGRES_PASSWORD=Testing1122 \
|
||||
--name demo-container \
|
||||
ghcr.io/dbafromthecold/demo-postgres:latest
|
||||
|
||||
Note - the image size is 497MB which may or may not be an issue depending on your internet connection
|
||||
|
||||
<br>
|
||||
|
||||
# Taking a full backup
|
||||
|
||||
Starting with the simplest of the backup types, the full backup. This is a copy (or dump) of the database into a separate file that can be used to roll the database back to the point that the backup was taken.
|
||||
|
||||
Let’s run through taking a backup of the *dvdrental* database in the PostgreSQL image.
|
||||
|
||||
Connect into the server via pgAdmin (server name is *localhost* and the password is *Testing1122*), right-click on the *dvdrental* database and select *Backup*…
|
||||
|
||||

|
||||
|
||||
Enter a directory and filename on your local machine to store the backup file (in this example, I’m using *C:\temp\dvdrental.backup*).
|
||||
|
||||
Then hit Backup! Nice and simple!
|
||||
|
||||
If we click on the Processes tab in pgAdmin, we can see the completed backup. What’s nice about this is that if we click on the file icon, it will give us a dialog box of the exact command executed, and a step-by-step log of the process run.
|
||||
|
||||

|
||||
|
||||
|
||||
The process used a program called pg_dump (https://www.postgresql.org/docs/current/backup-dump.html) to execute the backup and store the files in the location specified.
|
||||
|
||||
This is essentially an export of the database as it was when the backup was taken.
|
||||
|
||||
<br>
|
||||
|
||||
# Restoring the full backup
|
||||
|
||||
Ok, say that database got accidentally dropped (it happens!)…we can use the backup to restore it.
|
||||
|
||||
Open a query against the PostgreSQL database and run: -
|
||||
|
||||
DROP DATABASE dvdrental
|
||||
|
||||
Note - if that throws an error, run this beforehand: -
|
||||
|
||||
SELECT pg_terminate_backend(pg_stat_activity.pid)
|
||||
FROM pg_stat_activity
|
||||
WHERE pg_stat_activity.datname = 'dvdrental'
|
||||
AND pid <> pg_backend_pid();
|
||||
|
||||
OK, now we need to get the database back. So create a new database: -
|
||||
|
||||
CREATE DATABASE dvdrental_restore
|
||||
|
||||
Execute that command and then right-click on the newly created database in the left-hand menu, then select *Restore*…
|
||||
|
||||

|
||||
|
||||
Select the filename of the backup that we performed earlier, then click Restore.
|
||||
|
||||
Let that complete, refresh the database on the left-hand menu…and there we have it! Our database has been restored…all the tables and the data is back as it was when we took the backup!
|
||||
|
||||
As with the backup, we can see the exact command used to restore the database. If we click on the processes tab in pgAdmin, and then the file icon next to our restore, we will see: -
|
||||
|
||||

|
||||
|
||||
Here we can see that a program called pg_restore was used to restore the database from the backup file that we created earlier.
|
||||
|
||||
<br>
|
||||
|
||||
# Point in time restores
|
||||
|
||||
Now that we’ve run through taking a full backup and then restoring that backup…let’s have a look at performing a point-in-time restore of a database.
|
||||
|
||||
Full backups are great to get the data at a certain point…but we can only get the data back to the point when that full backup was taken. Typically full backups are run once daily (at the most) so if we had a data loss event several hours after that backup was taken…we would lose all data changes made since that backup if we just restored the full backup.
|
||||
|
||||
In order to recover the database to a point in time after the full backup was taken we need to restore additional backups to “roll” the database forward.
|
||||
|
||||
We can do this in PostgreSQL as PostgreSQL maintains a write ahead log (WAL) that records every change (transaction) made to the database. The main purpose of this log is that if the server crashes the database can be brought back to a consistent state by replaying the transactions in the log.
|
||||
|
||||
But this also means that we can archive the log (or WAL files) and use them to perform a point in time restore of the database.
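As a quick, hedged check, you can ask PostgreSQL where it currently is in the write ahead log (these are standard PostgreSQL functions, available in version 10 and later): -

SELECT pg_current_wal_lsn();                      -- current write position in the WAL
SELECT pg_walfile_name(pg_current_wal_lsn());     -- the WAL file that position lives in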
|
||||
|
||||
Let’s run through setting up write ahead logging and then performing a point in time restore.
|
||||
|
||||
First thing to do is run a container with PostgreSQL installed: -
|
||||
|
||||
docker run -d \
|
||||
--publish 5432:5432 \
|
||||
--env POSTGRES_PASSWORD=Testing1122 \
|
||||
--name demo-container \
|
||||
ghcr.io/dbafromthecold/demo-postgres:latest
|
||||
|
||||
Jump into the container: -
|
||||
|
||||
docker exec -it -u postgres demo-container bash
|
||||
|
||||
In this container image there are two locations that we will use for our backups.
|
||||
*/postgres/archive/base* for the baseline backup and */postgres/archive/wal* for the log archive.
|
||||
|
||||
Now we’re going to edit the *postgresql.conf* file to enable WAL archiving: -
|
||||
|
||||
vim $PGDATA/postgresql.conf
|
||||
|
||||
Drop the following lines into the config file: -
|
||||
|
||||
archive_mode = on
|
||||
archive_command = 'cp %p /postgres/archive/wal/%f'
|
||||
|
||||
- <b>archive_mode</b> - enables WAL archiving
|
||||
- <b>archive_command</b> - the command used to archive the WAL files (%p is replaced by the path name of the file to archive, and any %f is replaced by the file name).
|
||||
|
||||
Exit the container and restart to enable WAL archiving: -
|
||||
|
||||
docker container restart demo-container
|
||||
|
||||
OK, the next thing to do is take a base backup of the database cluster. Here we are using pg_basebackup (https://www.postgresql.org/docs/current/app-pgbasebackup.html) which is different from the command used to take a full backup as it backs up all the files in the database cluster.
|
||||
|
||||
Aka it’s a file system backup of all the files on the server, whereas the full backup earlier used pg_dump, which backs up only one database.
|
||||
|
||||
Jump back into the container: -
|
||||
|
||||
docker exec -it -u postgres demo-container bash
|
||||
|
||||
And take the backup: -
|
||||
|
||||
pg_basebackup -D /postgres/archive/base
|
||||
|
||||
We will use the files taken in this backup as the starting point of our point in time restore.
|
||||
|
||||
To test our point in time restore, connect to the database dvdrental in pgAdmin (server is localhost and password is Testing1122), create a table, and import sample data (csv file is in the container image): -
|
||||
|
||||
CREATE TABLE test_table (
|
||||
id smallint,
|
||||
first_name VARCHAR(50),
|
||||
last_name VARCHAR(50),
|
||||
dob DATE,
|
||||
email VARCHAR(255),
|
||||
CONSTRAINT test_table_pkey PRIMARY KEY (id)
|
||||
)
|
||||
|
||||
COPY test_table(id,first_name, last_name, dob, email)
|
||||
FROM '/dvdrental/test_data.csv'
|
||||
DELIMITER ','
|
||||
CSV HEADER;
|
||||
|
||||
Confirm the data is in the test table: -
|
||||
|
||||
SELECT * FROM test_table
|
||||
|
||||

|
||||
|
||||
What we’re going to do now is simulate a data loss event. For example, an incorrect DELETE statement executed against a table that removes all the data.
|
||||
|
||||
So wait a few minutes and run (make a note of the time): -
|
||||
|
||||
DELETE FROM test_table
|
||||
|
||||
Ok, the data is gone! To confirm: -
|
||||
|
||||
SELECT * FROM test_table
|
||||
|
||||

|
||||
|
||||
We need to get this data back! So, jump back into the container: -
|
||||
|
||||
docker exec -it -u postgres demo-container bash
|
||||
|
||||
The first thing to do in the recovery process is create a recovery file in the location of our base backup: -
|
||||
|
||||
touch /postgres/archive/base/recovery.signal
|
||||
|
||||
This file will automatically get deleted when we perform our point in time restore.
|
||||
|
||||
Now we need to edit the *postgresql.conf* file to tell PostgreSQL to perform the recovery: -
|
||||
|
||||
vim $PGDATA/postgresql.conf
|
||||
|
||||
Add in the following to the top of the file (you can leave the WAL archiving options there): -
|
||||
|
||||
restore_command = 'cp /postgres/archive/wal/%f %p'
|
||||
recovery_target_time = '2023-02-13 10:20:00'
|
||||
recovery_target_inclusive = false
|
||||
data_directory = '/postgres/archive/base'
|
||||
|
||||
|
||||
- <b>restore_command</b> - this is the command to retrieve the archived WAL files
|
||||
- <b>recovery_target_time</b> - this is the time that we are recovering to (change for just before you executed the DELETE statement against the table)
|
||||
- <b>recovery_target_inclusive</b> - this specifies to stop the recovery before the specified recovery time
|
||||
- <b>data_directory</b> - this is where we point PostgreSQL to the files taken in the base backup
|
||||
|
||||
Now we need to tell PostgreSQL to switch to a new WAL file, allowing for the old one to be archived (and used in recovery): -
|
||||
|
||||
psql -c "select pg_switch_wal();"
|
||||
|
||||
Almost there! Jump back out of the container and then restart: -
|
||||
|
||||
docker container restart demo-container
|
||||
|
||||
If we check the logs of the container, we can see the recovery process: -
|
||||
|
||||
docker container logs demo-container
|
||||
|
||||

|
||||
|
||||
We can see at the end of the logs that we have one more thing to do, so, one more time, jump back into the container: -
|
||||
|
||||
docker exec -it -u postgres demo-container bash
|
||||
|
||||
And resume replay of the WAL files: -
|
||||
|
||||
psql -c "select pg_wal_replay_resume();"
|
||||
|
||||
OK, finally to confirm, in pgAdmin, connect to the dvdrental database and run: -
|
||||
|
||||
SELECT * FROM test_table
|
||||
|
||||

|
||||
|
||||
The data is back! We have successfully performed a point in time restore of our database!
|
||||
|
||||
Join us tomorrow where we will be talking about high availability and disaster recovery.
|
||||
|
||||
Thanks for reading!
|
209
2023/day66.md
@ -0,0 +1,209 @@
|
||||
# High availability and disaster recovery
|
||||
|
||||
|
||||
Hello and welcome to the fourth post in the database part of the 90 Days of DevOps blog series! Today we’ll be talking about high availability and disaster recovery.
|
||||
|
||||
One of the main jobs of a database administrator is to configure and maintain disaster recovery and high availability strategies for the databases that they look after. In a nutshell they boil down to: -
|
||||
|
||||
<b>Disaster recovery (DR)</b> - recovering databases in the event of a site outage.<br>
|
||||
<b>High availability (HA)</b> - ensuring databases stay online in the event of a server outage.
|
||||
|
||||
Let’s go through both in more detail.
|
||||
|
||||
<br>
|
||||
|
||||
# Disaster recovery
|
||||
|
||||
Database administrators are a paranoid bunch (Andrew nodding his head). It’s their job to think about how the database servers will fail and how best to recover from that failure.
|
||||
|
||||
Two main factors come into play when thinking about disaster recovery…
|
||||
|
||||
<b>RTO - Recovery Time Objective</b> - How long can the databases be offline after a failure?<br>
|
||||
<b>RPO - Recovery Point Objective</b> - How much data can be lost in the event of a failure?
|
||||
|
||||
Basically, RTO is how quickly we need to get the databases back online after a failure, and RPO is how much data we can afford to lose in the event of a failure.
|
||||
|
||||
In the last post we talked about backing up and restoring databases…and backups could be good enough for a disaster recovery strategy. Now, there’s a load of caveats with that statement!
|
||||
|
||||
In the event of a site outage…can we easily and quickly restore all of the databases within the RTO and to the RPO? More often than not, for anyone looking after anything more than a couple of small databases, the answer is no.
|
||||
|
||||
So an alternate strategy would need to be put into place.
|
||||
|
||||
A common strategy is to have what is known as a “warm standby”. Another server is spun up in a different site to the primary server (could potentially be another private data centre or the cloud) and a method of pushing data changes to that server is put into place.
|
||||
|
||||
There’s a couple of methods of doing this…one is referred to as “log shipping”. A full backup of the database is restored to the disaster recovery server and then the logs of the database are “shipped” from the primary server and restored to the secondary.
|
||||
In the event of an outage on the primary site, the secondary databases are brought online and the applications pointed to that server.
|
||||
|
||||
This means that the databases can be brought online in a relatively short period of time but caution needs to be taken as there can be data loss with this method. It depends on how often the logs are being restored to the secondary…which is where the RPO comes into play.
|
||||
|
||||
The Database Administrator needs to ensure that the logs are shipped frequently enough to the secondary so that in the event of a primary site outage, the amount of data loss falls within the RPO.
|
||||
|
||||
Another method of keeping a warm standby is asynchronous replication or mirroring. In this method, the full backup of the database is restored as before but then transactions are sent to the secondary when they are executed against the primary server.
|
||||
|
||||
Again, data loss can occur with this method as there is no guarantee that the secondary is “up-to-date” with the primary server. The transactions are sent to the secondary and committed on the primary…without waiting for the secondary to acknowledge that it has committed them. This means that the secondary can lag behind the primary…the amount of lag is determined by the network connectivity between the primary and secondary sites, the number of transactions hitting the primary, and the amount of data being altered.
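As a hedged example, on a PostgreSQL secondary you can estimate how far behind it is using a standard replay-timestamp function: -

-- run on the secondary: time since the last replayed transaction was originally committed on the primary
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;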
|
||||
|
||||
<br>
|
||||
|
||||
# High availability
|
||||
|
||||
Disaster recovery strategies really do mean recovering from a “disaster”, typically the entire primary site going down.
|
||||
|
||||
But what if just one server goes down? We wouldn’t want to enact our DR strategy just for that one server…this is where high availability comes in.
|
||||
|
||||
High availability means that if a primary server goes down, a secondary server will take over (pretty much) instantly…with no data loss.
|
||||
|
||||
In this setup, a primary server and one or more secondary servers are set up in a group (or cluster). If the primary server has an issue…one of the secondaries will automatically take over.
|
||||
|
||||
There are various different methods of setting this up…PostgreSQL and MySQL have synchronous replication and this is the method we will focus on here.
|
||||
|
||||
Synchronous replication means that when a transaction is executed against the primary server, it is also sent to the secondaries, and the primary waits for acknowledgement from the secondaries that they have committed the transaction before committing it itself.
|
||||
|
||||
Now, this means that the network between the primary and the secondaries has to be able to handle the volume of transactions and data being sent between all the servers in the cluster, because if the secondaries take a long time to receive, commit, and acknowledge transactions, the transactions on the primary will take longer as well.
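As a hedged sketch of how synchronous replication is switched on in PostgreSQL (the standby name below is illustrative and must match the standby's application_name; these are standard PostgreSQL settings): -

ALTER SYSTEM SET synchronous_standby_names = 'standby1';   -- illustrative standby name
ALTER SYSTEM SET synchronous_commit = 'on';                 -- wait for the named standby before a commit returns
SELECT pg_reload_conf();                                    -- reload the configuration without a restart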
|
||||
|
||||
Let’s have a look at setting up replication between two instances of PostgreSQL.
|
||||
|
||||
<br>
|
||||
|
||||
# Setting up replication for PostgreSQL
|
||||
|
||||
What we’re going to do here is spin up two containers running PostgreSQL and then get replication set up from the “primary” to the “secondary”.
|
||||
|
||||
One thing we’re not going to do here is configure the servers for automatic failover, i.e. - the secondary server taking over if there is an issue with the primary.
|
||||
|
||||
As noted in the PostgreSQL documentation (https://www.postgresql.org/docs/current/warm-standby-failover.html), PostgreSQL does not natively implement a system to provide automatic failover; external tools such as PAF (http://clusterlabs.github.io/PAF/) have to be used…so we’ll skip that here and just get replication working.
|
||||
|
||||
First thing to do is create a docker custom bridge network: -
|
||||
|
||||
docker network create postgres
|
||||
|
||||
This will allow our two containers to communicate using their names (instead of IP addresses)
|
||||
|
||||
Now we can run our first container on the custom network which is going to be our “primary” instance: -
|
||||
|
||||
docker run -d \
--publish 5432:5432 \
--network=postgres \
--volume C:\temp\base:/postgres/archive/base \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres
|
||||
|
||||
This container run statement is a little different than the ones used in the previous blog posts.
|
||||
|
||||
We’ve included: -
|
||||
|
||||
<b>--network=postgres</b> - this is the custom docker network that we’ve created<br>
|
||||
<b>-v C:\temp\base:/postgres/archive/base</b> - mounting a directory on our local machine to /postgres/archive/base in the container. This is where we will store the base backup for setting up the secondary. Change the location based on your local machine, I’m using C:\temp\base in this example.
|
||||
|
||||
Now exec into the container: -
|
||||
|
||||
docker exec -it -u postgres demo-container bash
|
||||
|
||||
We need to update the pg_hba.conf file to allow connections to our secondary instance: -
|
||||
|
||||
vim $PGDATA/pg_hba.conf
|
||||
|
||||
Add in the following lines to the top of the file: -
|
||||
|
||||
# TYPE DATABASE USER ADDRESS METHOD
|
||||
host replication replicator 172.18.0.1/24 trust
|
||||
|
||||
<b>172.18.0.1/24</b> is the address range of containers on the custom network. If you have other custom docker networks this will change (confirm the address of the primary container with docker container inspect demo-container)
|
||||
|
||||
OK, connect to the primary container in pgAdmin (server is *localhost* and password is *Testing1122*) and create a user for replication: -
|
||||
|
||||
CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'Testing1122';
|
||||
|
||||
Then create a slot for replication: -
|
||||
|
||||
SELECT * FROM pg_create_physical_replication_slot('replication_slot_slave1');
|
||||
|
||||
N.B. - Replication slots provide an automated way to ensure that a primary server does not remove WAL files until they have been received by the secondaries. Aka they ensure that the secondaries remain up-to-date.
|
||||
|
||||
Confirm that the slot has been created: -
|
||||
|
||||
SELECT * FROM pg_replication_slots;
|
||||
|
||||

|
||||
|
||||
Now back in the container, we take a base backup: -
|
||||
|
||||
pg_basebackup -D /postgres/archive/base -S replication_slot_slave1 -X stream -U replicator -Fp -R
|
||||
|
||||
Alright, what’s happening here?
|
||||
|
||||
<b>-D /postgres/archive/base</b> - specify the location for the backup<br>
|
||||
<b>-S replication_slot_slave1</b> - specify the replication slot we created (N.B. - this uses out-of-date terminology which will hopefully be changed in the future)<br>
|
||||
<b>-X stream</b> - Include WAL files in backup (stream whilst the backup is being taken)<br>
|
||||
<b>-U replicator</b> - specify user<br>
|
||||
<b>-Fp</b> - specify format of the output (plain)<br>
|
||||
<b>-R</b> - creates the standby.signal file in the location of the directory (for setting up the standby server using the results of the backup)<br>
|
||||
|
||||
More information about these parameters can be found here: -
|
||||
https://www.postgresql.org/docs/current/app-pgbasebackup.html
|
||||
|
||||
Now we are ready to create our secondary container.
|
||||
|
||||
docker run -d \
--publish 5433:5432 \
--network=postgres \
--volume C:\temp\base:/var/lib/postgresql/data \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container2 \
ghcr.io/dbafromthecold/demo-postgres
|
||||
|
||||
Again this container run statement is a little different than before. We’re on the custom network (as with the first container) but we also have: -
|
||||
|
||||
<b>-p 5433:5432</b> - changing the port that we connect to the instance on as we already have our primary container on port 5432.<br>
|
||||
<b>-v C:\temp\base:/var/lib/postgresql/data</b> - this is saying to use the directory that we stored our base backup as the data location for the postgres instance in the secondary. We’re doing this so we don’t have to copy the base backup into the secondary container and change the default data directory.
|
||||
|
||||
Once the secondary is running, jump into it: -
|
||||
|
||||
docker exec -it -u postgres demo-container2 bash
|
||||
|
||||
And open the postgresql.auto.conf file: -
|
||||
|
||||
vim $PGDATA/postgresql.auto.conf
|
||||
|
||||
Here we are going to add in information about the primary container. Replace the *primary_conninfo* line with: -
|
||||
|
||||
primary_conninfo = 'host=demo-container port=5432 user=replicator password=Testing1122'
|
||||
|
||||
Exit out of the container and restart both the primary and secondary: -
|
||||
|
||||
docker container restart demo-container demo-container2
|
||||
|
||||
We’re now ready to test replication from the primary container to the secondary! Connect to the *dvdrental* database in the primary container in pgAdmin (server is *localhost* and password is *Testing1122*).
|
||||
|
||||
Create a test table and import some data: -
|
||||
|
||||
CREATE TABLE test_table (
|
||||
id smallint,
|
||||
first_name VARCHAR(50),
|
||||
last_name VARCHAR(50),
|
||||
dob DATE,
|
||||
email VARCHAR(255),
|
||||
CONSTRAINT test_table_pkey PRIMARY KEY (id)
|
||||
)
|
||||
|
||||
COPY test_table(id,first_name, last_name, dob, email)
|
||||
FROM '/dvdrental/test_data.csv'
|
||||
DELIMITER ','
|
||||
CSV HEADER;
|
||||
|
||||
Then connect to the dvdrental database in pgAdmin in the secondary container (server name and password are the same as the primary container but change the port to *5433*).

Run the following to check the data: -

SELECT * FROM test_table



And there's the data. We have successfully configured replication between two instances of PostgreSQL!

You can test this further by deleting the data on the primary and then querying the secondary - the delete will be replicated as well.

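If you want to try that, here's a minimal sketch of the statements involved. It assumes the *test_table* created above; run the first two statements against the primary (port 5432) and the last one against the secondary (port 5433). *pg_stat_replication* is a standard PostgreSQL view that shows the state of connected standbys: -

```sql
-- On the primary: confirm the secondary is streaming, then remove the test data
SELECT client_addr, state, sync_state FROM pg_stat_replication;
DELETE FROM test_table;

-- On the secondary: the rows should disappear almost immediately
SELECT COUNT(*) FROM test_table;
```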
Join us tomorrow where we'll be talking about performance tuning.

Thanks for reading!

2023/day67.md
@ -0,0 +1,138 @@

# Performance tuning
Hello and welcome to the fifth post in the database part of the 90 Days of DevOps blog series! Today we'll be talking about performance tuning.

Performance tuning is a massive area in the database field. There are literally thousands of books, blog posts, videos, and conference talks on the subject. People have made (and are still making) careers out of performance tuning databases.

We're not going to cover everything here - it's pretty much impossible to do in one blog post - so what we'll do is talk about the main areas of focus when it comes to ensuring that the database systems we are looking after hit the performance targets required.

<br>

# Server performance tuning
Andrew always tells people to know their environment completely when it comes to approaching performance tuning.

This means we need to start off by looking at the hardware our database is running on.

This used to be "relatively" simple. We had physical servers attached to storage where we would install an OS, and then install our database engine. Here we're concerned with the specifications of the server: CPU, memory, and storage.

<b>CPU</b> - does the server have enough compute power to handle the number of transactions the database engine will be executing?

<b>Memory</b> - database systems cache data in memory to perform operations (certain ones, Redis for example, work entirely in memory). Does the server have enough memory to handle the amount of data that it'll be working with?

<b>Storage</b> - is the storage available to the server fast enough so that when data is requested from disk it can be served up with minimal latency?

Nowadays the most common setup for database servers is running on a virtual machine. A physical machine is carved up into virtual machines in order to make resource usage more efficient, improve manageability, and reduce cost.

But this means that we have another layer to consider when looking at getting the maximum performance out of our servers.

Not only do we have the same areas to look at as with physical machines (CPU, memory, storage) but we also now have to consider the host that the virtual machine is running on.

Does that host have enough resources to handle the traffic that the virtual machine will be generating? What other virtual machines are on the host that our database server is on? Is the host oversubscribed (i.e. the virtual machines on the host have more resources assigned to them than the physical host actually has)?

Running database engines in containers is now becoming more popular (as we have been doing in the demos for this series). However, just running a database server in one container can lead to issues (see this series' post on high availability). For production workloads, a container orchestrator is used. There are a few out there but the main one that has come to the fore is Kubernetes.

So this means that we have a whole bunch of other considerations when thinking about performance for our database engine.

What spec are the hosts in the Kubernetes cluster? Will they have enough resources to handle the traffic of our database engine? What else is running on that cluster? Have we got the correct settings in our deployment manifest for our database engine?

Knowing your environment completely is the first step in building a database server that will perform to the required standard. Once we know we have built a server that can handle the transactions hitting our databases, we can move on to the next step: tuning the database engine itself.

<br>

# Database engine performance tuning

Database systems come with a huge variety of settings that can be used to tune them. There are some settings that Database Administrators will say 100% need to be altered for all workloads, no matter what they are, but there are others that depend on the workload itself.

For example, the memory available to the database engine may not be enough for the workload that the system will be dealing with, in which case it will need to be increased. Or conversely, it may not be limited at all…which would allow the database engine to consume all the memory on the server, starving the OS and leading to issues…so it would need to be limited.

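To make that concrete, here's a minimal PostgreSQL sketch of inspecting and changing that kind of setting. The value shown is purely illustrative, not a recommendation: -

```sql
-- View the current memory-related settings
SHOW shared_buffers;
SHOW work_mem;

-- Change a setting instance-wide; ALTER SYSTEM writes it to postgresql.auto.conf
-- (the value below is just for illustration)
ALTER SYSTEM SET shared_buffers = '2GB';

-- shared_buffers only takes effect after a restart of the instance;
-- reloadable settings can be applied with: SELECT pg_reload_conf();
```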
Getting the right configuration settings can be a daunting process…especially for anyone new to the particular database system being tuned. This is where development environments come into play…having a development environment that is (somewhat) similar to production allows Database Administrators to make configuration changes and monitor their effect.

The challenge here is that, typically, development environments do not get the throughput that production environments do, and the databases are smaller as well.

To get around this, there are a host of tools out there that can simulate workload activity. A DBA would run a tool to get a baseline of the performance of the system, make some configuration changes, then run the tool again to see what the increase in performance is (if any 🙂).

Once the database engine is configured we can then move on to the next area of performance tuning: query performance.

<br>

# Query performance tuning

Even with a powerful server and a properly configured database engine, query performance can still be poor.

Thankfully there are a host of tools that can capture queries hitting databases and report on their performance. If a particular query starts suffering, a DBA needs to go and analyse what has gone wrong.

When a query hits a database, an execution plan for that query is generated…an execution plan describes how the data will be retrieved from the database.

The plan is generated from statistics stored in the database, which could potentially be out of date. If they are out of date then the plan generated can result in the query being inefficient.

For example…if a large dataset has been inserted into a table, the statistics for that table may not have been updated, resulting in any queries against that table having an inefficient plan generated, as the database engine does not know about the new data that has been inserted.

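In PostgreSQL (the engine used in the demos in this series), we can look at a plan and refresh statistics by hand; a minimal sketch using the *actor* table from the *dvdrental* database: -

```sql
-- Show the plan and the actual run-time figures for a query
-- (EXPLAIN ANALYZE executes the query, so be careful with anything other than a SELECT)
EXPLAIN ANALYZE SELECT * FROM actor WHERE last_name = 'Cage';

-- Manually refresh the statistics for a table, e.g. after a large data load
ANALYZE actor;
```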
Another key factor when it comes to query performance is indexing. If a query hits a large table and does not have a supporting index, it will scan through every row in the table until it finds the rows required…not an efficient way to retrieve data.

Indexes solve this problem by pointing the query to the correct rows in the table. They are often compared to the index of a book. Instead of a reader going through each page in a book to find what they need, they simply go to the index, find the entry that points them to the page they are looking for, and then go straight to that page.

So the key questions a Database Administrator will ask when troubleshooting query performance are…are the statistics up to date? Are there any supporting indexes for this query?

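Both of those questions can be answered from PostgreSQL's own catalog views; a minimal sketch, again using the *actor* table as the example: -

```sql
-- When were the statistics last updated, and how many rows have changed since?
SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
FROM pg_stat_user_tables
WHERE relname = 'actor';

-- What indexes already exist on the table?
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'actor';
```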
It can be tempting to add indexes to a table to cover all the queries hitting it. However, when data in a table is updated, any indexes on that table need to be updated as well, so there is a performance hit on INSERT/UPDATE/DELETE statements. It's all about finding the correct balance.

Let's have a look at indexes in the *dvdrental* database.

Run a container: -

docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres:latest

Connect to the dvdrental database in pgAdmin (server is *localhost* and password is *Testing1122*). Open a new query window and run the following: -

SELECT * FROM actor WHERE last_name = 'Cage'

We can see that 200 rows are returned. If we want to see the execution plan being used, we can hit the Explain button: -



Here we can see that the plan is really simple. We have just one operation: a scan on the actor table.

However, if we look at the actor table in the left-hand side menu, there is an index on the last_name column!



So why didn't the query use that index?

This is due to the size of the table…it only has 200 rows, so the database engine decided that a full table scan would be more efficient than doing an index lookup. This is just one of the very many nuances of query performance!

Let's force PostgreSQL to use that index. Run the following: -

SET enable_seqscan = 'false'

NOTE - this setting is just for developing queries to see if a particular query would use an index on a large dataset. Don’t go doing this in a production environment!
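If you do try this locally, the planner can be put back to its default behaviour afterwards with the standard RESET command: -

```sql
-- Revert enable_seqscan to its default for this session
RESET enable_seqscan;
```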
Then highlight the SELECT statement and hit the Explain button: -



And there we can see that the query is now using the index and then going back to the table! So if the dataset here was larger, we know that we have an index to support that query.

Ok, but what about querying on first_name in the actor table: -

SELECT * FROM actor WHERE first_name = 'Nick'



Here we can see that we're back to the table scan. There's no supporting index on the first_name column!

Let's create one: -

CREATE INDEX idx_actor_first_name ON public.actor (first_name)

Now explain the SELECT statement on first_name again: -



And there we have it! Our query now has a supporting index!

Join us tomorrow where we'll be talking about database security.

Thanks for reading!

2023/day68.md
@ -0,0 +1,207 @@

# Database security

Hello and welcome to the sixth post in the database part of the 90 Days of DevOps blog series! Today we'll be talking about database security.

Controlling access to data is an incredibly important part of being a Database Administrator. DBAs need to prevent unauthorised access to the data that they are responsible for. In order to do this, all the different levels of access to a database need to be controlled: starting with the server, then the database engine, and then the data itself.

Let's go through each layer of security.

<br>

# Server security
The first area to look at when it comes to database security is who has access to the server that the database instance is running on.

Typically the System Administrators will have admin access to the server along with the Database Administrators. Now, DBAs won't like this…but do they really need admin access to the servers? It comes down to who will support any server issues…if that's solely down to the sysadmins, then the DBAs do not need admin access to the servers (Andrew's eye is twitching :-) ).

The next thing to consider is the account that the database service is running under. Andrew has seen multiple installs of SQL Server where the account the database engine runs under is a local admin on the server and, worse…the same account is used on multiple servers…all with admin access!

DBAs do this so that they don't run into any security issues when the database engine tries to access resources on the server, but it's not secure.

Database services should not run under an admin account. The account they run under should only have access to the resources that it needs. Permissions should be explicitly granted to that account and monitored.

The reason for this is that if the account becomes compromised, it does not have full access to the server. Imagine if an admin account that was used for multiple servers was compromised - that would mean all the servers that used that account would be vulnerable!

Individual accounts should be used for each database service, with only the required permissions granted.

That way, if one is compromised, only that server is affected and we can be (fairly) confident that the other servers in our environment are not at risk.

<br>

# Database security

The next level of security is who can access the databases on the server. Each database engine will have another level of security on top of who can access the server itself.

Certain database engines will allow local administrators of the server to have full access to the databases - this needs to be disabled. The reason is that if the server becomes compromised, access to the databases isn't automatically granted as well.

Database Administrators need to work out who needs access to the databases on the server and what level of access they should be given. For instance, the System Administrators have full access to the server but do they need full access to the databases? More often than not, no they don't.

Not only do DBAs need to work out who needs access but also what needs access. There will be application accounts that need to retrieve data from the databases. There could also be reporting and monitoring tools that need access.

Application accounts should only have access to the databases they require, and reporting/monitoring tools may need access to all the databases on the server but only require read-only access. Furthermore, applications and tools may only need access to certain tables within each database, so DBAs would need to restrict access even further.

Again, the reason for this is that if an account becomes compromised, the damage is limited.

Let's have a look at creating a custom user in PostgreSQL and assigning access rights.

<br>

# Creating a custom user in PostgreSQL

PostgreSQL uses the concept of roles to manage database access. A role can be either a user or a group of users and can have permissions assigned to it. Roles can also be granted membership of other roles.

The concept of roles replaces users and groups in PostgreSQL, but for simplicity what we're going to do here is create a new user and grant it membership of a pre-defined role.

Spin up a container running PostgreSQL: -

docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres

Connect to PostgreSQL in pgAdmin (server is *localhost* and password is *Testing1122*). Open a query window and run the following to view the existing users: -

SELECT usename FROM pg_user;



As we can see, there is only one user at the moment. This is the default postgres user, which has admin rights. We don't want anyone else using this account, so let's set up a new one.

To create a new custom user: -

CREATE USER test_user WITH PASSWORD 'Testing1122'

OK, confirm that the new user is there: -

SELECT usename FROM pg_user;



Great! Our user is there, now we need to assign some permissions to it. There are default roles within PostgreSQL that can be used to assign permissions. To view those roles: -

SELECT groname FROM pg_group;



For more information on these roles: -

https://www.postgresql.org/docs/current/predefined-roles.html

Now, these are default roles. They may be OK for our user but they also might grant more permissions than needed. For a production instance of PostgreSQL, custom roles can (should) be created that grant only the exact permissions needed for an account. But for this demo, we'll use one of the defaults.

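For illustration, a custom role might look something like the following minimal sketch. The role name *app_readonly* is made up for this example; the database, table, and user are the ones used elsewhere in this post: -

```sql
-- A role that can only read one table in one database
CREATE ROLE app_readonly NOLOGIN;
GRANT CONNECT ON DATABASE dvdrental TO app_readonly;
GRANT USAGE ON SCHEMA public TO app_readonly;
GRANT SELECT ON public.actor TO app_readonly;

-- Give our user membership of that role
GRANT app_readonly TO test_user;
```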
Grant read access to the custom user: -

GRANT pg_read_all_data TO test_user;

Log into the container in pgAdmin with the custom user's credentials, connect to the dvdrental database, and open the query tool.

Try running a SELECT statement against the actor table in the database: -

SELECT * FROM actor



The data is returned, as the user has access to read from any table in the database. Now try to update the data: -

UPDATE actor SET last_name = 'TEST'



An error is returned, as the user does not have write access to any table in the database.

Database Administrators must always ensure that users/applications have only the access that they need to the databases, and within PostgreSQL, roles are how that is achieved.

<br>

# Data encryption

The next level of security that needs to be considered is data encryption. There are different levels of encryption that can be applied to data. The first option is to encrypt the entire database.

If someone managed to gain access to the server, they could copy the database files off-site and then try to gain access to the data itself.

By encrypting part or all of the database, an attacker without the relevant encryption keys would not be able (or would be very unlikely) to gain access to the data.

If not all the data in a database is sensitive, then only certain columns within the database can be encrypted. For example, when storing login details for users in a database, the passwords for those users should (at a minimum) be encrypted.

Then we also need to consider how the data is being accessed. Any application accessing sensitive data should be using a secure connection. There's no point in having the data encrypted in the database and then sending it across the network unencrypted!

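As a quick check on that last point, PostgreSQL exposes whether TLS is enabled and whether the current session is actually encrypted; a minimal sketch (*pg_stat_ssl* is a standard system view): -

```sql
-- Is TLS enabled on this instance?
SHOW ssl;

-- Is the current session actually encrypted, and with what?
SELECT ssl, version, cipher
FROM pg_stat_ssl
WHERE pid = pg_backend_pid();
```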
Another area where encryption needs to be considered is backups. An attacker would not have to target the database server itself to gain access to the data - they could attempt to gain access to where the database backups are stored. If they gain that access, all they have to do is copy the backups off-site and restore them.

Andrew would always, 100%, advise that database backups are encrypted. When it comes to encryption of the online databases…there can be a performance penalty to pay…so it really comes down to how sensitive the data is.

Let's have a look at encrypting data within PostgreSQL.

<br>

# Encrypting a column in PostgreSQL

What we're going to do here is create a table that has a column that will contain sensitive data. We'll import some data as we have done in the previous posts and then encrypt the sensitive column.

Run a container from the custom image: -

docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres:latest

Connect to the dvdrental database in pgAdmin (server is *localhost* and password is *Testing1122*).

Install the pgcrypto extension: -

CREATE EXTENSION pgcrypto;

For more information on the pgcrypto extension: -

https://www.postgresql.org/docs/current/pgcrypto.html

Now create a test table: -

CREATE TABLE test_table (
id smallint,
first_name VARCHAR(50),
last_name VARCHAR(50),
dob DATE,
email VARCHAR(255),
passwd VARCHAR(255),
CONSTRAINT test_table_pkey PRIMARY KEY (id)
);

And import the sample data (included in the container image): -

COPY test_table(id, first_name, last_name, dob, email)
FROM '/dvdrental/test_data.csv'
DELIMITER ','
CSV HEADER;

Now we're going to use pgp_sym_encrypt to add an encrypted password to the table for both entries: -

UPDATE test_table
SET passwd = (pgp_sym_encrypt('Testing1122', 'ENCRYPTIONPWD'))
WHERE first_name = 'Andrew';

UPDATE test_table
SET passwd = (pgp_sym_encrypt('Testing3344', 'ENCRYPTIONPWD'))
WHERE first_name = 'Taylor';

Note - here we are using a password to encrypt the data. There are many more options for encrypting data within PostgreSQL…see here for more information: -

https://www.postgresql.org/docs/current/encryption-options.html

Now if we try to SELECT as usual from the table: -

SELECT first_name, last_name, passwd FROM test_table

We can only see the encrypted values: -



In order to view the encrypted data, we have to use pgp_sym_decrypt and the key that we set earlier: -

SELECT first_name, last_name, pgp_sym_decrypt(passwd::bytea, 'ENCRYPTIONPWD') FROM test_table



So if we have sensitive data within our database, this is one method of encrypting it so that it can only be accessed with the correct password.

Join us tomorrow for the final post in the database series of 90DaysOfDevOps where we'll be talking about monitoring and troubleshooting.

Thanks for reading!

2023/day69.md
@ -0,0 +1,164 @@

# Monitoring and troubleshooting database issues

Hello and welcome to the seventh and final post in the database part of the 90 Days of DevOps blog series! Today we'll be talking about monitoring and troubleshooting database issues.

Things can, and do, go wrong when looking after database servers, and when they do it's the job of the Database Administrator to firstly get the databases back online, and THEN investigate the cause of the issue.

The number one priority is to get the databases back online and in a healthy state.

Once that has been achieved, the root cause of the issue can be investigated and, once uncovered, fixes can be recommended.

There are many reasons that a server can run into issues…too many to list here! But they mainly fall into the following categories…issues with the hardware, the underlying OS, the database engine, and the transactions (queries) hitting the databases.

It may not just be one factor causing issues on the server - there may be many! The "death by 1000 cuts" scenario could be the cause: multiple small issues (typically frequently run queries that are experiencing a performance degradation) which overall result in the server going down or becoming so overloaded that everything grinds to a halt.

<br>

# Monitoring
The first step in troubleshooting database issues comes not when an actual issue has happened but when the database servers are operating normally.

We don't want to be constantly reacting to (aka firefighting) issues…we want to be proactive and anticipate issues before they happen.

DBAs are trained to think about how servers will fail and how the systems that they look after will react when they encounter failures. Thinking about how the systems will fail plays a huge role when designing high availability and disaster recovery strategies.

However, once in place, HA strategies are not infallible. There could be a misconfiguration that prevents the solution from reacting in the expected way. This means that HA (and DR!) strategies need to be regularly tested to ensure that they work as expected when they are needed!

We don't want to be troubleshooting a failed HA solution at the same time as having to deal with the issue that caused the HA solution to kick in in the first place! (Andrew - we really don't, as these things never happen at 4pm on a Tuesday…for some reason it always seems to be 2am on a weekend! 🙂 )

In order to effectively anticipate issues before they happen, we need to be monitoring the servers that we look after and have some alerting in place.

There are 100s (if not 1000s) of tools out there that we can use to monitor the servers we manage. There are paid options that come with support and require little configuration to set up, and there are others that are free but will require more configuration and have no support.

It's up to us to decide which monitoring tool we go for based on budget, availability (do we have time to spend on configuration?), and skillset (do we have the skills to configure and maintain the tool?).

Once we have the tool(s) in place, we then point them at our servers and make sure that we are monitoring (as an example): -

- CPU Usage
- Memory Usage
- Disk Usage
- Network throughput
- Transactions per second
- Database size

Note - This is not a definitive list of things to monitor; there are a tonne of other metrics to collect based on what system(s) are running on the server.

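Several of these can be pulled straight out of PostgreSQL itself, whichever monitoring tool ends up scraping them; a minimal sketch using the standard statistics views (the database name is the *dvdrental* one from the earlier demos): -

```sql
-- Transaction counts and buffer cache hit ratio per database
SELECT datname, xact_commit, xact_rollback,
       blks_hit::float / NULLIF(blks_hit + blks_read, 0) AS cache_hit_ratio
FROM pg_stat_database;

-- Database size, in a human-readable format
SELECT pg_size_pretty(pg_database_size('dvdrental'));
```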
Having monitoring in place means that we can see what the "normal" state of the servers is, and if anything changes we can pin down the exact time it took place, which is invaluable when investigating.

Certain tools can be hooked into other systems. For example, in a previous role, Andrew's monitoring tool was hooked into the deployment system. So when something was deployed to the database servers, there was a notification on the servers' monitoring page.

So if, say, the CPU on a particular server skyrocketed up to 100% usage, the DBAs could not only see when this occurred but also what was deployed to that server around that time. Incredibly helpful when troubleshooting.

One thing to consider when setting up monitoring is when we want to be alerted. Do we want to be alerted after an issue has occurred or do we want to be alerted before?

For example, something is consuming more disk space than normal on a server and the disk is close to becoming completely full. Do we want to be alerted when the disk is full or when the disk is close to becoming full?

Now that seems like an obvious question but it's slightly more tricky to get right than you would think. If set up incorrectly, the monitoring tool will start outputting alerts like crazy and lead to what is known as "alert fatigue".

Alert fatigue is when the DBAs are sent so many alerts that they start to ignore them. The monitoring tool is incorrectly configured and is sending out alerts that do not require immediate action, so the DBAs take no steps to clear them.

This can lead to actual alerts that require immediate action being ignored, and then to servers going down.

To prevent this, only alerts that require immediate action should be sent to the DBAs. Different alert levels can be set in most tools but again we need to be careful - no-one wants to look at a system with 1000s of "warnings".

So a good monitoring tool is 100% necessary to prevent DBAs from spending their lives firefighting issues on the servers that they maintain.

<br>

# Log collection
Of course, with all the best monitoring tools in the world and proactive DBAs working to prevent issues, things can still go wrong, and when they do the root cause needs to be investigated.

This generally means trawling through various logs to uncover the root cause.

Every database system will have an error log that can be used to investigate issues…this, along with the logs of the underlying operating system, is the first place to look when troubleshooting an issue (that's been resolved! :-) ).

However, having to go onto a server to retrieve the logs is not the best way to investigate issues. Sure, if only one server has issues then it's not so bad, but what if more than one server was affected?

Do we really want to be remoting onto each individual server to look at their logs?

What we need is a central location where logs can be collected from all the servers so that they can be aggregated and analysed.

This could be the same tool as our monitoring tool or it could be separate. What's important is that we have somewhere that we collect the logs from our servers, and that they are then presented in an easily searchable format.

We can also place alerts on the logs so that if a known error occurs we can immediately investigate.

Let's have a look at the PostgreSQL logs. Spin up a container: -

docker run -d \
--publish 5432:5432 \
--env POSTGRES_PASSWORD=Testing1122 \
--name demo-container \
ghcr.io/dbafromthecold/demo-postgres

We need to update the postgresql.conf file to write out to a log, so jump into the container: -

docker exec -it -u postgres demo-container bash

Open the file: -

vim $PGDATA/postgresql.conf

And add the following lines: -

logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'

Exit the container and restart it: -

docker container restart demo-container

Connect in pgAdmin (server is *localhost* and password is *Testing1122*). Open a new query window and run the following: -

SELECT 1/0

This will generate an error: -



OK, so we have a query hitting our database that is failing. We're asked to investigate, so the first place to start would be the logs.

Jump back into the container: -

docker exec -it -u postgres demo-container bash

And navigate to the log directory: -

cd $PGDATA/log

Then view the file: -

cat postgresql-2023-02-24_110854.log



And there's our error! We've configured our instance of PostgreSQL to log errors to a file, which could then be collected and stored in a central location, so if we had this issue in a production environment we would not need to go onto the server to investigate.

So we've looked at monitoring our servers and collecting logs of errors, but what about query performance?

<br>

# Query performance
By far the most common issue that DBAs are asked to investigate is poorly performing queries.

Now, as mentioned in part 5 of this series, query performance tuning is a massive part of working with databases. People have made (and do make) whole careers out of this area! We're not going to cover everything in one blog post, so we will just briefly highlight the main areas to look at.

A proactive approach is needed here in order to prevent query performance degrading. The main areas to look at in order to maintain query performance are query structure, indexes, and statistics.

Is the query structured in a way that optimally retrieves data from the tables in the database? If not, how can it be rewritten to improve performance? (This is a massive area btw, one that takes considerable knowledge and skill.)

When it comes to indexing, do the queries hitting the database have indexes to support them? If indexes are there, are they the most optimal? Have they become bloated (for example, containing empty pages due to data being deleted)?

In order to prevent issues with indexes, a maintenance schedule should be implemented (rebuilding them on a regular basis, for example).

It's the same with statistics. Statistics in databases can be automatically updated, but sometimes (say, after a large data insert) they can become out of date, resulting in bad plans being generated for queries. In this case DBAs could implement a maintenance schedule for the statistics as well.

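In PostgreSQL, that kind of maintenance can be scripted with standard commands; a minimal sketch using the index and table from the earlier demos (REINDEX … CONCURRENTLY needs PostgreSQL 12 or later): -

```sql
-- Rebuild a (potentially bloated) index without blocking writes
REINDEX INDEX CONCURRENTLY idx_actor_first_name;

-- Refresh the statistics for the table in the same maintenance window
ANALYZE actor;
```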
And again, the same goes for monitoring and log collection: there are a whole host of tools out there that can be used to track queries. These are incredibly useful as they give the ability to see how a query's performance has changed over a period of time.

Caution does need to be taken with some of these tools as they can have a negative effect on performance. Tracking every single query and collecting information on them can be a rather intensive operation!

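One widely used option in PostgreSQL is the *pg_stat_statements* extension. A minimal sketch is below; note that it only returns data once *pg_stat_statements* has been added to *shared_preload_libraries* and the instance restarted, and the column names shown are the PostgreSQL 13+ ones: -

```sql
-- Enable the extension in the current database
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- The ten statements consuming the most total execution time
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```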
So having the correct monitoring, log collection, and query tracking tools is vital when it comes not only to preventing issues from arising but also to allowing for quick resolution when they do occur.

And that's it for the database part of the 90DaysOfDevOps blog series. We hope this has been useful…thanks for reading!

@ -0,0 +1,17 @@

# What is Serverless?

The term "Serverless" has become quite the buzzword over these past few years, with some folks today even arguing whether or not the term still holds weight. You may or may not have heard about serverless technology up to this point in your journey through development. But what exactly is serverless? What does it mean to design applications in a serverless manner? What constitutes a serverless service or offering? I hope to answer all these questions and more over this series of blog posts.

As an [AWS Hero](https://aws.amazon.com/developer/community/heroes/kristi-perreault/?did=dh_card&trk=dh_card) and a Principal Software Engineer at a large Fortune 100 enterprise, I have been focused solely on serverless technologies and enablement over the last 3 years. I have spoken quite extensively about our [serverless journey](https://youtu.be/ciCz8dnDuEs), [what serverless means in a large enterprise](https://syntax.fm/show/484/supper-club-70-000-serverless-functions-with-kristi-perreault-of-liberty-mutual), and [how to be successful in a corporate setting](https://www.youtube.com/watch?v=ctdviJ2Ewio). Everything I have learned has been through experience, on the job. The [serverless definition](https://aws.amazon.com/serverless/) I resonate with most states that serverless is event-driven, your resources scale up and down without you needing to manage them, and it follows an "only pay for what you use" model. Let's break this down a bit further:

- [Event-driven architecture](https://aws.amazon.com/event-driven-architecture/#:~:text=An%20event%2Ddriven%20architecture%20uses,on%20an%20e%2Dcommerce%20website.) is exactly how it sounds. You build your applications following the flow of your events. When x happens, I want to trigger y, to run service z. We write our serverless applications with this flow in mind.

- Automatically scaling up and down as your service demands is a key component of serverless functionality. I fault the name "serverless" here quite a bit because, contrary to popular belief, serverless does in fact include servers - you just don't need to manage them in the same way you would with your on-premises ecosystem or other cloud resources. You still need to provision the resources you need, and some configuration is required, but gone are the days of estimating exactly how much storage and processing power you need - your cloud provider handles this for you. This frees the developer up to focus more on business code, and less on physical infrastructure.

- With automatic scaling, you also only pay for exactly what you are using. You no longer need to buy and maintain a physical server you may only use to half its capacity, save for the one time of year your traffic hits its peak, for instance. You don't need to pay for all the storage and processing power you have to have "just in case" - you pay exactly for what you use, exactly when you need it. No more, no less.

I am a large proponent of serverless, and I believe these are huge benefits to adopting serverless, but that does not mean it is for everyone or every architecture. I talk quite a bit about the concept of "[serverless-first](https://www.liberty-it.co.uk/stories/articles/tomorrow-talks-how-build-serverless-first-developer-experience)" design, meaning that you approach every architecture in a serverless manner first, and if that is not the most optimal design, you move on to other solutions like containers, relational databases, reserved compute instances, and so on. Equally important, especially in a large enterprise, is to evaluate your time constraints and areas of expertise. Serverless is not going to be for everyone, and depending on your background, there can be a large learning curve associated with adopting it. The trade-off is worth it, but if you do not have the adequate time or drive to dedicate to this transformation, you will not be successful.

That being said, I hope to provide you with a strong starting point for the land of serverless. Over the next few days, we will be exploring serverless resources and services, from compute, to storage, to API design, and more. We will keep our discussions high-level, but I'll be sure to include relevant examples, resources, and further reading from other leading industry experts. No prerequisites are necessary, I just ask that you approach each and every article with an open mind, continue to ask questions & provide feedback, and let's dive in!*

*As a quick disclaimer - as I am an AWS Serverless Hero, most of the examples and explanations I give will reference the AWS ecosystem since that is where my expertise is. Many of the AWS services and tools we will discuss have equivalents across Azure, GCP, or other tooling. I will do my best to call these out going forward. This is part of a series that will be covered here, but I also encourage you to follow along on [Medium](https://kristiperreault.medium.com/what-is-serverless-1b46a5ffa7b3) or [Dev.to](https://dev.to/aws-heroes/what-is-serverless-4d4p) for more.

@ -0,0 +1,25 @@

Compute is one of the basic building blocks of any application. What is your application aiming to accomplish? Where are you keeping your business logic? What are you _running_ and **how**?



The key compute resource everyone immediately associates serverless with is [AWS Lambda](https://aws.amazon.com/lambda/). I want to be clear here - AWS Lambda is a large part of the serverless ecosystem, BUT Lambda ≠ serverless. Just because you include a lambda function in your application does not automatically make you serverless. Likewise, just because you build a service completely out of lambda functions also does not mean you have a serverless application. Serverless is a mindset, a new way of thinking; there are tons of tools and services out there that will help you build serverless applications, but simply implementing them does not mean you have a serverless application. We need to be intentional with our design, and with where & how we use these resources.

**AWS Lambda**
If you're in Azure, your equivalent service is [Azure Functions](https://learn.microsoft.com/en-us/azure/azure-functions/). For Google, this is [Cloud Functions](https://cloud.google.com/functions) (yes, AWS just HAD to be different). Regardless of its name, all of these services fulfill the same purpose - a small compute building block to house your business logic code. An AWS Lambda function is simply the code you want to run, written in your language of choice (I prefer Python, but TypeScript and Java are popular options). In your [infrastructure code](https://learn.microsoft.com/en-us/devops/deliver/what-is-infrastructure-as-code#:~:text=Infrastructure%20as%20code%20(IaC)%20uses,load%20balancers%2C%20and%20connection%20topologies.), you specify some [lambda function basics](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/lambda/create-function.html?highlight=create%20lambda%20function), like the name, the path to the business logic code, a security role, and what runtime you're using, and you optionally have the ability to control more parameters like timeout, concurrency, aliases, and more. Lambda even has built-in integrations to other AWS services, such as [S3](https://aws.amazon.com/s3/) and [SQS](https://aws.amazon.com/sqs/) (we'll get to these) to make application development even easier. Additionally, [lambda functions are priced](https://aws.amazon.com/lambda/pricing/) based on the number of times they're invoked and the duration of time they run, making them exceptionally affordable.



Of course, there are cases where lambda functions may not be the best option for compute - if you have long-running, highly complex computational processes, lambda may not be the best fit for your application. If you're migrating an application to serverless, it's also very likely that this is not a 1:1 changeover. Throwing all of your code into one lambda function is not optimizing for serverless, meaning that your monolith architecture may need to be rewritten into [microservices](https://aws.amazon.com/microservices/) to take full advantage of everything lambda and serverless have to offer. Whether you're migrating or building something brand new however, lambda functions are (dare I say) the choice for serverless compute as they're lightweight, easy to provision, and cost effective.

**AWS Fargate**
AWS defines [Fargate](https://docs.aws.amazon.com/AmazonECS/latest/userguide/what-is-fargate.html) as a 'serverless compute engine', but I prefer to define it as a serverless container. [Containers](https://aws.amazon.com/containers/services/) are an entirely different topic of discussion so I won't dive into them too much, but they do fall under the same "[modern applications](https://aws.amazon.com/modern-apps/)" umbrella that serverless does, making them a sort of sibling option. Containerizing an application is a fancy way of saying you are bundling all aspects of your application into a more portable service (i.e., basically sticking your app into a box for ease of movement and use). This makes containers very popular for migrating applications, batch processing, AI/ML, and more.

Fargate stands sort of in the middle as a container service that offers many of the same serverless benefits that lambda has - no need to manage your infrastructure, built-in integrations to other AWS services, and pay-as-you-go pricing. What makes you choose one compute option over the other then? In my experience, this really comes down to what you want your end product to be, what time you have, and what experience you have. Personally, I see Fargate as more of a 'lift-and-shift' solution for applications that you don't want to change much of, but need to move quickly and easily while still wanting to take advantage of serverless benefits. It definitely has its place as part of other applications as well, but it also comes down to the level of comfort you or your teams have with serverless or containerization. Containers may be quicker to adopt, whereas serverless requires more of a mindset shift, and typically comes with some rearchitecting. I believe this does pay off tenfold in the long run, but given your particular use cases and time constraints, Fargate may be a better option.



These two options pretty much sum up serverless compute, believe it or not. When it comes to your business logic code in AWS or another cloud provider, these two services cover most, if not all, serverless application needs. As we continue on in this series, you'll realize there are a ton of other 'supporting' serverless services for storage, APIs, orchestration, and more to dive into. I hope this has given you a good preview of serverless compute and what's to come - tune in tomorrow where we'll discuss the various serverless storage solutions available to us. See you then!

*This is part of a series that will be covered here, but I also encourage you to follow along with the rest of the series on [Medium](https://kristiperreault.medium.com/serverless-compute-b19df2ea0935) or [Dev.to](https://dev.to/aws-heroes/serverless-compute-3bgo).