diff --git a/.github/workflows/post_to_telegram.yml b/.github/workflows/post_to_telegram.yml
new file mode 100644
index 0000000000..357d7d385e
--- /dev/null
+++ b/.github/workflows/post_to_telegram.yml
@@ -0,0 +1,25 @@
+name: Notify Telegram on Push
+
+on:
+  push:
+    branches:
+      - gh-pages
+
+jobs:
+  notify-telegram:
+    runs-on: ubuntu-latest
+
+    steps:
+      # Step 1: Checkout the repository code
+      - name: Checkout code
+        uses: actions/checkout@v3
+
+      # Step 2: Send message to Telegram
+      - name: Send message to Telegram
+        run: |
+          COMMIT_MESSAGE=$(git log -1 --pretty=%B)
+          AUTHOR=$(git log -1 --pretty=format:'%an')
+          # printf expands \n into real newlines; --data-urlencode keeps them intact in the request body
+          TEXT=$(printf '🚀 New commit pushed to *%s*:\n\n*Author*: %s\n*Message*: %s\n\n🔗 [View commit](https://github.com/%s/commit/%s)' "${{ github.repository }}" "${AUTHOR}" "${COMMIT_MESSAGE}" "${{ github.repository }}" "${{ github.sha }}")
+          curl -s -X POST "https://api.telegram.org/bot${{ secrets.TELEGRAM_BOT_TOKEN }}/sendMessage" \
+            -d chat_id="${{ secrets.TELEGRAM_CHAT_ID }}" \
+            --data-urlencode text="${TEXT}" \
+            -d parse_mode="Markdown"
\ No newline at end of file
diff --git a/.gitignore b/.gitignore
index 6526cbd49b..708a9150bf 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,49 @@
+# Jekyll build artifacts
 _site/
-_draft/
+.sass-cache/
 .jekyll-cache/
+.jekyll-metadata
+
+# Posts generator directory (contains sensitive config files)
+posts-generator/
+
+# Configuration files with sensitive data
+configs/
+# Service account keys (extra protection)
+configs/service-account.json
+
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# macOS
+.DS_Store
+.AppleDouble
+.LSOverride
+
+# Windows
+Thumbs.db
+ehthumbs.db
+Desktop.ini
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# Logs
+*.log
+
+.specstory/
+.cursorindexingignore
\ No newline at end of file
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..4c004fa65f
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,151 @@
+# AGENTS.md
+
+## Project Overview
+
+BeOps is a comprehensive documentation site covering DevOps best practices, Kubernetes, and Site Reliability Engineering (SRE) principles. 
+The project includes:
+
+- **Jekyll-based documentation site** with GitHub Pages hosting
+- **Python content generation tools** in `posts-generator/` directory
+- **AI-powered content creation** using OpenAI and Google Gemini APIs
+
+## Setup Commands
+
+### Jekyll Site Setup
+```bash
+# Install Ruby dependencies
+bundle install
+
+# Start local development server
+bundle exec jekyll serve --livereload
+
+# Build for production
+bundle exec jekyll build
+```
+
+### Python Tools Setup
+```bash
+# Navigate to posts-generator directory
+cd posts-generator
+
+# Create virtual environment
+python -m venv py-feedparser
+
+# Activate virtual environment
+source py-feedparser/bin/activate # On macOS/Linux
+# or
+py-feedparser\Scripts\activate # On Windows
+
+# Install dependencies
+pip install -r requirements.txt
+```
+
+## Code Style
+
+### Python (posts-generator/)
+- Follow PEP 8 style guidelines
+- Use meaningful variable names
+- Add docstrings for functions and classes
+- Use type hints where appropriate
+- Keep functions focused and single-purpose
+
+### Jekyll/Markdown
+- Use consistent front matter format
+- Follow Jekyll naming conventions for posts
+- Use descriptive file names with dates
+- Maintain consistent heading hierarchy
+
+## Testing Instructions
+
+### Python Tools Testing
+```bash
+cd posts-generator
+python py-feedparser.py --test
+python title_generator.py --test
+python youtube_processor.py --test
+```
+
+### Jekyll Site Testing
+```bash
+# Test build locally
+bundle exec jekyll build
+
+# Build with verbose output to surface warnings
+bundle exec jekyll build --verbose
+```
+
+## Content Generation Workflow
+
+### AI Content Creation
+1. **Configuration**: Use config files in `posts-generator/configs/`
+2. **Prompts**: Store AI prompts in `posts-generator/prompts.json`
+3. **Output**: Generated content goes to `posts-generator/produced_posts/`
+4. **Logging**: Check `posts-generator/logs/` for execution logs
+
+### Key Files
+- `py-feedparser.py`: Main content generation script
+- `title_generator.py`: AI-powered title generation
+- `youtube_processor.py`: YouTube content processing
+- `openai_worker_4o.py`: OpenAI API integration
+
+## Security Considerations
+
+- **API Keys**: Store in a `.env` file (not in version control)
+- **Service Accounts**: Use `configs/service-account.json` for Google APIs
+- **Configurations**: Keep sensitive configs out of public repos
+
+## File Structure Guidelines
+
+### Posts
+- Store in `_posts/` with the format `YYYY-MM-DD-title.md`
+- Use consistent front matter
+- Include proper categories and tags
+
+### Assets
+- Images go in `assets/`
+- Keep file sizes optimized
+- Use descriptive filenames
+
+## Deployment
+
+### GitHub Pages
+- Site automatically builds on push to the `gh-pages` branch
+- Check GitHub Actions for build status
+- Live site: https://neverthesame.github.io/BeOps/
+- All authoring and content changes must be made from the `gh-pages` branch.
+
+### Content Updates
+1. Generate new content using Python tools
+2. Review and edit generated content
+3. Add to `_posts/` directory
+4. 
Commit and push to trigger rebuild + +## Common Tasks + +### Adding New Content +```bash +cd posts-generator +python py-feedparser.py --config config-4o.json +``` + +### Updating Dependencies +```bash +# Python +pip freeze > requirements.txt + +# Ruby +bundle update +``` + +### Troubleshooting +- Check logs in `posts-generator/logs/` +- Verify API configurations +- Ensure virtual environment is activated +- Check Jekyll build output for errors + +## AI Integration Notes + +- **OpenAI**: Use GPT-4o for content generation +- **Google Gemini**: Use for YouTube content processing +- **Prompt Management**: All prompts stored in `prompts.json` +- **Rate Limiting**: Implement proper delays between API calls +- **Error Handling**: Log all API interactions for debugging diff --git a/Gemfile b/Gemfile index d563868f7b..4fd8c7ad47 100644 --- a/Gemfile +++ b/Gemfile @@ -5,7 +5,6 @@ source "https://rubygems.org" git_source(:github) { |repo_name| "https://github.com/#{repo_name}" } gem "jekyll" gem 'jekyll-feed' -gem 'jekyll-readme-index' gem 'jemoji' gem 'webrick' diff --git a/Gemfile.lock b/Gemfile.lock new file mode 100644 index 0000000000..d8cdb4784c --- /dev/null +++ b/Gemfile.lock @@ -0,0 +1,97 @@ +GEM + remote: https://rubygems.org/ + specs: + activesupport (6.1.7.10) + concurrent-ruby (~> 1.0, >= 1.0.2) + i18n (>= 1.6, < 2) + minitest (>= 5.1) + tzinfo (~> 2.0) + zeitwerk (~> 2.3) + addressable (2.8.7) + public_suffix (>= 2.0.2, < 7.0) + colorator (1.1.0) + concurrent-ruby (1.3.4) + em-websocket (0.5.3) + eventmachine (>= 0.12.9) + http_parser.rb (~> 0) + eventmachine (1.2.7) + ffi (1.17.0-arm64-darwin) + forwardable-extended (2.6.0) + gemoji (4.1.0) + google-protobuf (3.23.4-arm64-darwin) + html-pipeline (2.14.3) + activesupport (>= 2) + nokogiri (>= 1.4) + http_parser.rb (0.8.0) + i18n (1.14.6) + concurrent-ruby (~> 1.0) + jekyll (4.3.4) + addressable (~> 2.4) + colorator (~> 1.0) + em-websocket (~> 0.5) + i18n (~> 1.0) + jekyll-sass-converter (>= 2.0, < 4.0) + jekyll-watch (~> 2.0) + kramdown (~> 2.3, >= 2.3.1) + kramdown-parser-gfm (~> 1.0) + liquid (~> 4.0) + mercenary (>= 0.3.6, < 0.5) + pathutil (~> 0.9) + rouge (>= 3.0, < 5.0) + safe_yaml (~> 1.0) + terminal-table (>= 1.8, < 4.0) + webrick (~> 1.7) + jekyll-feed (0.17.0) + jekyll (>= 3.7, < 5.0) + jekyll-sass-converter (3.0.0) + sass-embedded (~> 1.54) + jekyll-watch (2.2.1) + listen (~> 3.0) + jemoji (0.13.0) + gemoji (>= 3, < 5) + html-pipeline (~> 2.2) + jekyll (>= 3.0, < 5.0) + kramdown (2.4.0) + rexml + kramdown-parser-gfm (1.1.0) + kramdown (~> 2.0) + liquid (4.0.4) + listen (3.9.0) + rb-fsevent (~> 0.10, >= 0.10.3) + rb-inotify (~> 0.9, >= 0.9.10) + mercenary (0.4.0) + minitest (5.25.1) + nokogiri (1.13.10-arm64-darwin) + racc (~> 1.4) + pathutil (0.16.2) + forwardable-extended (~> 2.6) + public_suffix (5.1.1) + racc (1.8.1) + rb-fsevent (0.11.2) + rb-inotify (0.11.1) + ffi (~> 1.0) + rexml (3.3.9) + rouge (3.30.0) + safe_yaml (1.0.5) + sass-embedded (1.58.3-arm64-darwin) + google-protobuf (~> 3.21) + terminal-table (3.0.2) + unicode-display_width (>= 1.1.1, < 3) + tzinfo (2.0.6) + concurrent-ruby (~> 1.0) + unicode-display_width (2.6.0) + webrick (1.9.0) + zeitwerk (2.6.18) + +PLATFORMS + arm64-darwin-24 + arm64-darwin-25 + +DEPENDENCIES + jekyll + jekyll-feed + jemoji + webrick + +BUNDLED WITH + 2.4.10 diff --git a/README.md b/README.md index fbe2938a01..67e9e89e0e 100644 --- a/README.md +++ b/README.md @@ -1,235 +1,156 @@ ---- -layout: home -title: Jekyll Gitbook Theme -permalink: / ---- +# BeOps - Best Practices for DevOps and 
SRE -Make Jelly site have a GitBook look! +A comprehensive documentation site covering DevOps best practices, Kubernetes, and Site Reliability Engineering (SRE) principles. -## Demo +## 🚀 Features -Live demo on Github Pages: [https://sighingnow.github.io/jekyll-gitbook](https://sighingnow.github.io/jekyll-gitbook) +- **DevOps Best Practices**: Practical guides and tutorials +- **Kubernetes Deep Dives**: From basics to advanced concepts +- **SRE Principles**: Site Reliability Engineering methodologies +- **Ingress Controllers**: Comprehensive coverage of Kubernetes networking +- **AI-Powered Content Generation**: Automated blog post creation using OpenAI and Google Gemini APIs -[![Jekyll Themes](https://img.shields.io/badge/featured%20on-JekyllThemes-red.svg)](https://jekyll-themes.com/jekyll-gitbook/) +## 📚 Content -## Why Jekyll with GitBook +This site is built with Jekyll and hosted on GitHub Pages. Visit the live site at: https://neverthesame.github.io/BeOps/ -GitBook is an amazing frontend style to present and organize contents (such as book chapters -and blogs) on Web. The typical to deploy GitBook at [Github Pages][1] -is building HTML files locally and then push to Github repository, usually to the `gh-pages` -branch. It's quite annoying to repeat such workload and make it hard for people do version -control via git for when there are generated HTML files to be staged in and out. +## 🏗️ Project Structure -This theme takes style definition out of generated GitBook site and provided the template -for Jekyll to rendering markdown documents to HTML, thus the whole site can be deployed -to [Github Pages][1] without generating and uploading HTML bundle every time when there are -changes to the original repo. - -## How to Get Started - -This theme can be used just as other [Jekyll themes][1] and support [remote theme][12], -see [the official guide][13] as well. - -You can introduce this jekyll theme into your own site by either - -- [Fork][3] this repository and add your markdown posts to the `_posts` folder. -- Use as a remote theme in your [`_config.yml`][14](just like what we do for this - site itself), - -```yaml -remote_theme: sighingnow/jekyll-gitbook -``` - -### Deploy Locally with Jekyll Serve - -This theme can be ran locally using Ruby and Gemfiles. - -[Testing your GitHub Pages site locally with Jekyll](https://docs.github.com/en/pages/setting-up-a-github-pages-site-with-jekyll/testing-your-github-pages-site-locally-with-jekyll) - GitHub - -## Full-text search - -The search functionality in jekyll-gitbook theme is powered by the [gitbook-plugin-search-pro][5] plugin and is enabled by default. - -[https://sighingnow.github.io/jekyll-gitbook/?q=generated](https://sighingnow.github.io/jekyll-gitbook/?q=generated) - -## Code highlight - -The code highlight style is configurable the following entry in `_config.yaml`: - -```yaml -syntax_highlighter_style: colorful ``` - -The default code highlight style is `colorful`, the full supported styles can be found from [the rouge repository][6]. Customized -style can be added to [./assets/gitbook/rouge/](./assets/gitbook/rouge/). - -## How to generate TOC - -The jekyll-gitbook theme leverages [jekyll-toc][4] to generate the *Contents* for the page. -The TOC feature is not enabled by default. 
To use the TOC feature, modify the TOC -configuration in `_config.yml`: - -```yaml -toc: - enabled: true - h_min: 1 - h_max: 3 +BeOps/ +├── _posts/ # Blog posts (Jekyll format) +├── _pages/ # Static pages +├── _layouts/ # Jekyll layouts +├── _includes/ # Jekyll includes +├── assets/ # Images and static assets +├── posts-generator/ # Python content generation tools +│ ├── py-feedparser.py # Main content generation script +│ ├── title_generator.py # AI-powered title generation +│ ├── youtube_processor.py # YouTube content processing +│ ├── openai_worker_4o.py # OpenAI API integration +│ ├── prompts.json # AI prompts configuration +│ ├── configs/ # Configuration files +│ ├── produced_posts/ # Generated content output +│ └── logs/ # Execution logs +├── configs/ # Project configuration files +├── AGENTS.md # AI agent instructions (see below) +└── README.md # This file ``` -## Google Analytics, etc. - -The jekyll-gitboook theme supports embedding the [Google Analytics][7], [CNZZ][8] and [Application Insights][9] website analytical tools with the following -minimal configuration in `_config.yaml`: +## 🛠️ Local Development -```yaml -tracker: - google_analytics: "" -``` - -Similarly, CNZZ can be added with the following configuration in `_config.yaml` +### Jekyll Site Setup -```yaml -tracker: - cnzz: "" -``` +```bash +# Install Ruby dependencies +bundle install -Application Insights can be added with the following configuration in `_config.yaml` +# Start local development server with live reload +bundle exec jekyll serve --livereload -```yaml -tracker: - application_insights: "" +# Build for production +bundle exec jekyll build ``` -## Disqus comments - -[Disqus](https://disqus.com/) comments can be enabled by adding the following configuration in `_config.yaml`: +The site will be available at `http://localhost:4000` -```yaml -disqushandler: "" -``` +### Python Content Generation Tools -## Jekyll collections +The project includes AI-powered tools for generating blog content: -Jekyll's [collections][15] is supported to organize the pages in a more fine-grained manner, e.g., +```bash +# Navigate to posts-generator directory +cd posts-generator -```yaml -collections: - pages: - output: true - sort_by: date - permalink: /:collection/:year-:month-:day-:title:output_ext - others: - output: true - sort_by: date - permalink: /:collection/:year-:month-:day-:title:output_ext -``` +# Create and activate virtual environment +python -m venv py-feedparser +source py-feedparser/bin/activate # On macOS/Linux +# or +py-feedparser\Scripts\activate # On Windows -An optional `ordered_collections` key can be added to `_config.yaml` to control the order of collections in the sidebar: +# Install dependencies +pip install -r requirements.txt -```yaml -ordered_collections: - - posts - - pages - - others +# Generate new content +python py-feedparser.py --config config-4o.json ``` -If not specified, the order of collections would be decided by Jekyll. Note that the key `posts` is a special collection -that indicates the `_posts` pages of Jekyll. - -## Extra StyleSheet or Javascript elements - -You can add extra CSS or JavaScript references using configuration collections: +**Key Tools:** +- `py-feedparser.py`: Main content generation from RSS feeds and other sources +- `title_generator.py`: AI-powered title generation +- `youtube_processor.py`: Process YouTube videos into blog posts +- `openai_worker_4o.py`: OpenAI GPT-4o integration -- extra_css: for additional style sheets. 
If the url does not start by http, the path must be relative to the root of the site, without a starting `/`. -- extra_header_js: for additional scripts to be included in the `` tag, after the `extra_css` has been added. If the url does not start by http, the path must be relative to the root of the site, without a starting `/`. -- extra_footer_js: for additional scripts to be included at the end of the HTML document, just before the site tracking script. If the url does not start by http, the path must be relative to the root of the site, without a starting `/`. +**Configuration:** +- API keys should be stored in `.env` file (not in version control) +- Configuration files are in `posts-generator/configs/` directory +- Generated posts are saved to `posts-generator/produced_posts/` +- Logs are available in `posts-generator/logs/` -## Customizing font settings +## 🧪 Testing -The fonts can be customized by modifying the `.book.font-family-0` and `.book.font-family-1` entry in [`./assets/gitbook/custom.css`][10], +### Jekyll Site +```bash +# Test build locally +bundle exec jekyll build -```css -.book.font-family-0 { - font-family: Georgia, serif; -} -.book.font-family-1 { - font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; -} +# Check for broken links +bundle exec jekyll build --verbose ``` -## Tips, Warnings and Dangers blocks - -The jekyll-gitbook theme supports customized kramdown attributes (`{: .block-tip }`, `{: .block-warning }`, -`{: .block-danger }`) like that displayed in [the discord.js website][11]. The marker can be used like - -```markdown -> ##### TIP -> -> This guide is last tested with @napi-rs/canvas^0.1.20, so make sure you have -> this or a similar version after installation. -{: .block-tip } +### Python Tools +```bash +cd posts-generator +python py-feedparser.py --test +python title_generator.py --test +python youtube_processor.py --test ``` -Rendered page can be previewed from +## 📝 Contributing -[https://sighingnow.github.io/jekyll-gitbook/jekyll/2022-06-30-tips_warnings_dangers.html](https://sighingnow.github.io/jekyll-gitbook/jekyll/2022-06-30-tips_warnings_dangers.html) +We welcome contributions! Here's how you can help: -## Cover image inside pages +1. **Content Contributions**: Submit new blog posts or improve existing ones +2. **Bug Reports**: Report issues via GitHub Issues +3. **Feature Requests**: Suggest enhancements and new features +4. **Code Improvements**: Improve the content generation tools -The jekyll-gitbook theme supports adding a cover image to a specific page by adding -a `cover` field to the page metadata: +### Adding New Content -```diff - --- - title: Page with cover image - author: Tao He - date: 2022-05-24 - category: Jekyll - layout: post -+ cover: /assets/jekyll-gitbook/dinosaur.gif - --- -``` +1. Generate content using the Python tools (see above) +2. Review and edit generated content +3. Add to `_posts/` directory with format: `YYYY-MM-DD-title.md` +4. Follow Jekyll front matter conventions +5. 
Submit a pull request -The effect can be previewed from +### Code Style -[https://sighingnow.github.io/jekyll-gitbook/jekyll/2022-05-24-page_cover.html](https://sighingnow.github.io/jekyll-gitbook/jekyll/2022-05-24-page_cover.html) +- **Python**: Follow PEP 8 guidelines, use type hints, add docstrings +- **Markdown**: Use consistent front matter, follow Jekyll naming conventions +- **Commits**: Use descriptive commit messages -## Diagrams with mermaid.js +## 🔒 Security -This jekyll-theme supports [mermaid.js](https://mermaid.js.org/) to render diagrams -in markdown. +- **API Keys**: Never commit API keys to the repository. Use `.env` files +- **Service Accounts**: Google service account files should not be in public repos +- **Sensitive Data**: Keep all sensitive configurations out of version control -To enable the mermaid support, you need to set `mermaid: true` in the front matter -of your post. +## 🚀 Deployment -```markdown ---- -mermaid: true ---- -``` +The site is automatically deployed to GitHub Pages on push to the main branch. The build process is handled by GitHub Actions. -The example can be previewed from +**Deployment Workflow:** +1. Generate new content using Python tools +2. Review and edit generated content +3. Add to `_posts/` directory +4. Commit and push to trigger automatic rebuild -[https://sighingnow.github.io/jekyll-gitbook/jekyll/2023-08-31-mermaid.html](https://sighingnow.github.io/jekyll-gitbook/jekyll/2023-08-31-mermaid.html) +## 🤖 AI Agent Support -## License +This project includes an `AGENTS.md` file that provides detailed instructions for AI coding agents working on this project. If you're using AI assistants like Cursor, GitHub Copilot, or similar tools, they will automatically reference `AGENTS.md` for project-specific guidance. -This work is open sourced under the Apache License, Version 2.0. +**For AI Agents**: See `AGENTS.md` for comprehensive setup instructions, code style guidelines, testing procedures, and workflow documentation. -Copyright 2019 Tao He. +## 📄 License -[1]: https://pages.github.com -[2]: https://pages.github.com/themes -[3]: https://github.com/sighingnow/jekyll-gitbook/fork -[4]: https://github.com/allejo/jekyll-toc -[5]: https://github.com/gitbook-plugins/gitbook-plugin-search-pro -[6]: https://github.com/rouge-ruby/rouge/tree/master/lib/rouge/themes -[7]: https://analytics.google.com/analytics/web/ -[8]: https://www.cnzz.com/ -[9]: https://docs.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview -[10]: https://github.com/sighingnow/jekyll-gitbook/blob/master/gitbook/custom.css -[11]: https://discordjs.guide/popular-topics/canvas.html#setting-up-napi-rs-canvas -[12]: https://rubygems.org/gems/jekyll-remote-theme -[13]: https://docs.github.com/en/pages/setting-up-a-github-pages-site-with-jekyll/adding-a-theme-to-your-github-pages-site-using-jekyll -[14]: https://github.com/sighingnow/jekyll-gitbook/blob/master/_config.yml -[15]: https://jekyllrb.com/docs/collections/ +This project is licensed under the MIT License. \ No newline at end of file diff --git a/_config.yml b/_config.yml index f4c1c9c7a2..9241dc4622 100644 --- a/_config.yml +++ b/_config.yml @@ -1,16 +1,16 @@ # Configurations -title: Jekyll Gitbook -longtitle: Jekyll Gitbook -author: HE Tao -email: sighingnow@gmail.com +title: BeOps +longtitle: BeOps - Best Practices for DevOps and SRE. +author: Kirill Kuklin +email: beops.it@gmail.com description: > - Build Jekyll site with the GitBook style. + BeOps - Best Practices for DevOps and SRE. 
version: 1.0 gitbook_version: 3.2.3 -url: 'https://sighingnow.github.io' -baseurl: '/jekyll-gitbook' +url: 'https://NeverTheSame.github.io' +baseurl: '/BeOps' rss: RSS # bootstrap: use the remote theme for the site itself @@ -70,5 +70,4 @@ regenerate: true plugins: - jekyll-feed - - jekyll-readme-index - jemoji diff --git a/_posts/2019-04-27-why.md b/_posts/2019-04-27-why.md deleted file mode 100644 index 81af6b7b61..0000000000 --- a/_posts/2019-04-27-why.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -title: Why Jekyll with GitBook -author: Tao He -date: 2019-04-27 -category: Jekyll -layout: post ---- - -GitBook is an amazing frontend style to present and organize contents (such as book chapters -and blogs) on Web. The typical to deploy GitBook at [Github Pages][1] -is building HTML files locally and then push to Github repository, usually to the `gh-pages` -branch. However, it's quite annoying to repeat such workload and make it hard for people do -version control via git for when there are generated HTML files to be staged in and out. - -This theme takes style definition out of generated GitBook site and provided the template -for Jekyll to rendering markdown documents to HTML, thus the whole site can be deployed -to [Github Pages][1] without generating and uploading HTML bundle every time when there are -changes to the original repository. - -[1]: https://pages.github.com \ No newline at end of file diff --git a/_posts/2019-04-28-howto.md b/_posts/2019-04-28-howto.md deleted file mode 100644 index d1e9750fb1..0000000000 --- a/_posts/2019-04-28-howto.md +++ /dev/null @@ -1,38 +0,0 @@ ---- -title: How to Get Started -author: Tao He -date: 2019-04-28 -category: Jekyll -layout: post ---- - -The jekyll-gitbook theme can be used just as other [Jekyll themes][3] and -support [remote theme][2] on [Github Pages][1], see [the official guide][4] -as well. - -You can introduce this jekyll theme into your own site by either - -- [Fork][5] this repository and add your markdown posts to the `_posts` folder, then - push to your own Github repository. -- Use as a remote theme in your [`_config.yml`][6](just like what we do for this - site itself), - -```yaml -# Configurations -title: Jekyll Gitbook -longtitle: Jekyll Gitbook - -remote_theme: sighingnow/jekyll-gitbook -``` - -> ##### TIP -> -> No need to push generated HTML bundle. -{: .block-tip } - -[1]: https://pages.github.com -[2]: https://github.com/sighingnow/jekyll-gitbook/fork -[3]: https://pages.github.com/themes -[4]: https://docs.github.com/en/pages/setting-up-a-github-pages-site-with-jekyll/adding-a-theme-to-your-github-pages-site-using-jekyll -[5]: https://github.com/sighingnow/jekyll-gitbook/fork -[6]: https://github.com/sighingnow/jekyll-gitbook/blob/master/_config.yml diff --git a/_posts/2019-04-29-license.md b/_posts/2019-04-29-license.md deleted file mode 100644 index 793e52a6ee..0000000000 --- a/_posts/2019-04-29-license.md +++ /dev/null @@ -1,12 +0,0 @@ ---- -title: License -author: Tao He -date: 2019-04-29 -category: Jekyll -layout: post ---- - -This work is open sourced under the Apache License, Version 2.0, using the -same license as the original [GitBook](https://github.com/GitbookIO/gitbook) repository. - -Copyright 2019 Tao He. 
diff --git a/_posts/2021-08-10-toc.md b/_posts/2021-08-10-toc.md deleted file mode 100644 index 6ad1f81967..0000000000 --- a/_posts/2021-08-10-toc.md +++ /dev/null @@ -1,176 +0,0 @@ ---- -title: How to Generate TOC -author: Tao He -date: 2021-08-10 -category: Jekyll -layout: post ---- - -The jekyll-gitbook theme leverages [jekyll-toc][1] to generate the *Contents* for the page. -The TOC feature is not enabled by default. To use the TOC feature, modify the TOC -configuration in `_config.yml`: - -```yaml -toc: - enabled: true -``` - -Why this repo -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -### Sub title 1 - -### Sub title 2 - -### Sub title 3 - -Why this repo -------------- - -long contents ..... - -+ 1 -+ 2 -+ 3 -+ 4 - -Why this repo -------------- - -long contents ..... - -1. e -2. f -3. g -4. h - -Why this repo -------------- - -+ 5 -+ 6 -+ 7 -+ 8 - -Why this repo -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -Why this repo -------------- - -long contents ..... - -+ 1 -+ 2 -+ 3 -+ 4 - -Why this repo -------------- - -long contents ..... - -1. e -2. f -3. g -4. h - -Why this repo -------------- - -+ 5 -+ 6 -+ 7 -+ 8 - -Why this repo -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -Why this repo -------------- - -long contents ..... - -+ 1 -+ 2 -+ 3 -+ 4 - -Why this repo -------------- - -long contents ..... - -1. e -2. f -3. g -4. h - -Why this repo -------------- - -+ 5 -+ 6 -+ 7 -+ 8 - -Why this repo -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -Why this repo -------------- - -long contents ..... - -+ 1 -+ 2 -+ 3 -+ 4 - -Why this repo -------------- - -long contents ..... - -1. e -2. f -3. g -4. h - -Why this repo -------------- - -+ 5 -+ 6 -+ 7 -+ 8 - -[1]: https://github.com/allejo/jekyll-toc diff --git a/_posts/2022-05-24-page_cover.md b/_posts/2022-05-24-page_cover.md deleted file mode 100644 index 44a57e2fbe..0000000000 --- a/_posts/2022-05-24-page_cover.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: Page with cover image -author: Tao He -date: 2022-05-24 -category: Jekyll -layout: post -cover: https://sighingnow.github.io/jekyll-gitbook/assets/dinosaur.gif ---- - -The jekyll-gitbook theme supports adding a cover image to a specific page by adding -a `cover` field to the page metadata: - -```diff - --- - title: Page with cover image - author: Tao He - date: 2022-05-24 - category: Jekyll - layout: post -+ cover: /assets/jekyll-gitbook/dinosaur.gif - --- -``` diff --git a/_posts/2022-06-26-wide_tables.md b/_posts/2022-06-26-wide_tables.md deleted file mode 100644 index 9693b161ea..0000000000 --- a/_posts/2022-06-26-wide_tables.md +++ /dev/null @@ -1,36 +0,0 @@ ---- -title: Wide tables -author: Tao He -date: 2022-06-26 -category: Jekyll -layout: post ---- - -A wide tables needs to be wrapped into a `div` with class `table-wrapper` -to make sure it displayed as expected on mobile devices. For example, - -```markdown -
- -|title1|title2|title3|title4|title5|title6|title7|title8| -|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:| -|1|2|3|4|5|6|7|8| -|1|2|3|4|5|6|7|8| -|1|2|3|4|5|6|7|8| -|1|2|3|4|5|6|7|8| - -
-``` - -Will be rendered as - -
- -|title1|title2|title3|title4|title5|title6|title7|title8| -|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:| -|1|2|3|4|5|6|7|8| -|1|2|3|4|5|6|7|8| -|1|2|3|4|5|6|7|8| -|1|2|3|4|5|6|7|8| - -
diff --git a/_posts/2022-06-30-tips_warnings_dangers.md b/_posts/2022-06-30-tips_warnings_dangers.md deleted file mode 100644 index 0dfb55d1f6..0000000000 --- a/_posts/2022-06-30-tips_warnings_dangers.md +++ /dev/null @@ -1,67 +0,0 @@ ---- -title: Tips, Warnings, and Dangers -author: Tao He -date: 2022-06-30 -category: Jekyll -layout: post ---- - -This jekyll-theme supports tips, warnings, and dangers blocks and the style is referred -from [the discord.js website][1]. - -You could have the following [markdown attributes (supported by kramdown)][2]: - -### Tips - -Using a `{: .block-tip}` attribute: - -```markdown -> ##### TIP -> -> This guide is last tested with @napi-rs/canvas^0.1.20, so make sure you have -> this or a similar version after installation. -{: .block-tip } -``` - -> ##### TIP -> -> This guide is last tested with @napi-rs/canvas^0.1.20, so make sure you have -> this or a similar version after installation. -{: .block-tip } - -### Warnings - -Using a `{: .block-warning}` attribute: - -```markdown -> ##### WARNING -> -> Be sure that you're familiar with things like async/await and object destructuring -> before continuing, as we'll be making use of features like these. -{: .block-warning } -``` - -> ##### WARNING -> -> Be sure that you're familiar with things like async/await and object destructuring -> before continuing, as we'll be making use of features like these. -{: .block-warning } - -### Dangers - -Using a `{: .block-danger}` attribute: - -```markdown -> ##### DANGER -> -> You cannot delete an ephemeral message. -{: .block-danger } -``` - -> ##### DANGER -> -> You cannot delete an ephemeral message. -{: .block-danger } - -[1]: https://discordjs.guide/popular-topics/canvas.html#setting-up-napi-rs-canvas -[2]: https://kramdown.gettalong.org/quickref.html#block-attributes diff --git a/_posts/2023-08-31-mermaid.md b/_posts/2023-08-31-mermaid.md deleted file mode 100644 index 5aec3057a3..0000000000 --- a/_posts/2023-08-31-mermaid.md +++ /dev/null @@ -1,73 +0,0 @@ ---- -title: Diagrams with mermaid.js -author: Tao He -date: 2023-08-31 -category: Jekyll -layout: post -mermaid: true ---- - -This jekyll-theme supports [mermaid.js](https://mermaid.js.org/) to render diagrams -in markdown. - -To enable the mermaid support, you need to set `mermaid: true` in the front matter -of your post. - -```markdown ---- -title: Diagrams with mermaid.js -date: 2023-08-31 -layout: post -mermaid: true ---- -``` - -Then you can use mermaid syntax in your markdown: - -``` -graph TD; - A-->B; - A-->C; - B-->D; - C-->D; -``` - -```mermaid -graph TD; - A-->B; - A-->C; - B-->D; - C-->D; -``` - -Or, even some complex examples: - -``` -sequenceDiagram - participant Alice - participant Bob - Alice->>John: Hello John, how are you? - loop Healthcheck - John->>John: Fight against hypochondria - end - Note right of John: Rational thoughts
prevail! - John-->>Alice: Great! - John->>Bob: How about you? - Bob-->>John: Jolly good! -``` - -```mermaid -sequenceDiagram - participant Alice - participant Bob - Alice->>John: Hello John, how are you? - loop Healthcheck - John->>John: Fight against hypochondria - end - Note right of John: Rational thoughts
prevail! - John-->>Alice: Great! - John->>Bob: How about you? - Bob-->>John: Jolly good! -``` - -Refer to the [mermaid.js website](https://mermaid.js.org/intro/) for more examples. diff --git a/_posts/2023-10-14-math-latex.md b/_posts/2023-10-14-math-latex.md deleted file mode 100644 index 36a9cd353c..0000000000 --- a/_posts/2023-10-14-math-latex.md +++ /dev/null @@ -1,47 +0,0 @@ ---- -title: MathJax and LaTeX -author: Tao He -date: 2023-10-14 -category: Jekyll -layout: post -mermaid: true ---- - -This jekyll-theme supports [MathJax](https://www.mathjax.org/) to render $\LaTeX$ -and mathematics expressions. - -> ##### TIP -> -> Currently, Kramdown uses double dollar sign delimiters for inline and display math: -> [https://kramdown.gettalong.org/syntax.html#math-blocks](https://kramdown.gettalong.org/syntax.html#math-blocks). -{: .block-tip } - -e.g., - -```markdown -The well known Pythagorean theorem $x^2 + y^2 = z^2$ was -proved to be invalid for other exponents. -Meaning the next equation has no integer solutions: - -$$ x^n + y^n = z^n $$ -``` - -The well known Pythagorean theorem $x^2 + y^2 = z^2$ was -proved to be invalid for other exponents. -Meaning the next equation has no integer solutions: - -$$ x^n + y^n = z^n $$ - -Another example with more complex markups: - -```markdown -When $a \ne 0$, there are two solutions to $ax^2 + bx + c = 0$ and they are - -$$x = {-b \pm \sqrt{b^2-4ac} \over 2a}.$$ -``` - -When $a \ne 0$, there are two solutions to $ax^2 + bx + c = 0$ and they are - -$$x = {-b \pm \sqrt{b^2-4ac} \over 2a}.$$ - -Refer to the [MathJax website](https://docs.mathjax.org/en/latest/index.html) for more examples. diff --git a/_posts/2023-12-12-footnotes.md b/_posts/2023-12-12-footnotes.md deleted file mode 100644 index eae423aff8..0000000000 --- a/_posts/2023-12-12-footnotes.md +++ /dev/null @@ -1,125 +0,0 @@ ---- -title: Using Footnotes -author: Tao He -date: 2023-12-12 -category: Jekyll -layout: post -mermaid: true ---- - -This jekyll-theme supports [MathJax](https://www.mathjax.org/) to render footnotes -in markdown. - -e.g., - -```markdown -The well known Pythagorean theorem $x^2 + y^2 = z^2$ was -proved to be invalid for other exponents[^1]. -Meaning the next equation has no integer solutions: - -$$ x^n + y^n = z^n $$ -``` - -The well known Pythagorean theorem $x^2 + y^2 = z^2$ was -proved to be invalid for other exponents[^1]. -Meaning the next equation has no integer solutions: - -$$ x^n + y^n = z^n $$ - -Long contents -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -### Sub title 1 - -### Sub title 2 - -### Sub title 3 - -Long contents -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -### Sub title 1 - -### Sub title 2 - -### Sub title 3 - -Long contents -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -### Sub title 1 - -### Sub title 2 - -### Sub title 3 - -Long contents -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -### Sub title 1 - -### Sub title 2 - -### Sub title 3 - -Long contents -------------- - -long contents ..... - -1. a -2. b -3. c -4. d - -### Sub title 1 - -### Sub title 2 - -### Sub title 3 - -Long contents -------------- - -long contents ..... - -1. a -2. b -3. c -4. 
d - -### Sub title 1 - -### Sub title 2 - -### Sub title 3 - -[^1]: [https://en.wikipedia.org/wiki/Fermat%27s_Last_Theorem](https://en.wikipedia.org/wiki/Fermat%27s_Last_Theorem) diff --git a/_posts/2024-12-15-kubernetes.md b/_posts/2024-12-15-kubernetes.md new file mode 100644 index 0000000000..04ce78a5e6 --- /dev/null +++ b/_posts/2024-12-15-kubernetes.md @@ -0,0 +1,91 @@ +--- +title: Readiness Probes +author: Kirill Kuklin +date: 2024-12-15 +category: k8s +layout: post +cover: ../assets/k8s-probes.gif +--- + +# Readiness Probes + +## What Are Readiness Probes and Why Do You Need Them? 🚀 + +When you deploy an application in Kubernetes, you want it to run smoothly without any hiccups. But sometimes, getting +an application up and running isn't as simple as "just start it and go." Your app might need time to initialize a +database, load dependencies, or simply ensure that it's ready to handle requests. That's where *readiness probes* +come into play — small but incredibly useful helpers that let Kubernetes know: "Hey, is this pod actually ready?" + +## Why Are Readiness Probes Important? + +Here's the scenario: your application starts, but it's not ready to process requests just yet. Without a readiness +probe, Kubernetes might immediately begin sending traffic to it. The result? Errors, unhappy users, and you scrambling +to figure out what went wrong. + +A readiness probe, however, checks your app's state and tells Kubernetes: "Hold on, not ready yet. Give me a few +seconds to settle." Only when your application signals that it's ready does Kubernetes start routing traffic to it. + +## A Simple Example 🧪 + +Let's say you have an application that needs to check its database connection before it can handle requests. In your pod's manifest, you'd configure a readiness probe like this: + +```yaml +readinessProbe: + httpGet: + path: /healthz + port: 8080 + initialDelaySeconds: 5 + periodSeconds: 3 +``` + +### What's Happening: +- Kubernetes sends HTTP requests to http://your_pod:8080/healthz to check if the application is ready. +- `initialDelaySeconds` is the time Kubernetes waits after the pod starts before it begins checking. +- `periodSeconds` is how often Kubernetes will perform the check. + +If your app returns a 200 OK response, Kubernetes knows it's ready to receive traffic. + +## When Does This Save the Day? + +1. **Microservices**: If one service depends on another, it's crucial to ensure traffic only flows to ready instances. For example, your API service won't work if the authentication service hasn't finished initializing. + +2. **Heavy Applications**: Some apps (like Java monoliths) take a while to warm up. Readiness probes give them the breathing room they need to start properly. + +3. **Deployment Chaos**: During rolling updates, readiness probes prevent a half-initialized pod from taking traffic, which could otherwise break the entire system. + +## Beyond HTTP: Other Readiness Probe Types + +Readiness probes aren't limited to just HTTP checks. Here are a few other ways they can verify if a pod is ready: + +### TCP Socket + +Checks if a port is open and accepting connections. + +```yaml +readinessProbe: + tcpSocket: + port: 3306 + initialDelaySeconds: 10 + periodSeconds: 5 +``` + +### Command Execution + +Runs a custom script or command. + +```yaml +readinessProbe: + exec: + command: + - cat + - /tmp/ready + initialDelaySeconds: 5 + periodSeconds: 3 +``` + +In this example, the pod is considered ready if the file `/tmp/ready` exists. 
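+
+Whichever probe type you choose, your application has to give Kubernetes something concrete to check. As a rough sketch of the application side of the HTTP example above (Flask, the `db` hostname, port 5432, and the TCP check are all illustrative assumptions, not part of the manifests), a `/healthz` endpoint might look like this:
+
+```python
+# Hypothetical handler behind the httpGet readiness probe shown earlier.
+# Flask and the database host/port below are assumptions for illustration.
+import socket
+
+from flask import Flask
+
+app = Flask(__name__)
+
+def database_is_reachable(host="db", port=5432, timeout=1.0):
+    """Return True if a TCP connection to the database can be opened."""
+    try:
+        with socket.create_connection((host, port), timeout=timeout):
+            return True
+    except OSError:  # covers refused connections and timeouts
+        return False
+
+@app.route("/healthz")
+def healthz():
+    # 200 marks the pod Ready; 503 keeps it out of Service endpoints.
+    if database_is_reachable():
+        return "ok", 200
+    return "not ready", 503
+
+if __name__ == "__main__":
+    app.run(host="0.0.0.0", port=8080)
+```
+
+For HTTP probes, any status code from 200 to 399 counts as success; anything else (or a timeout) marks the pod as not ready, and Kubernetes quietly keeps it out of the Service endpoints until the check passes again.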
+ +## Final Thoughts 🤔 + +Readiness probes are like polite gatekeepers: they only let traffic through when your application is truly ready. Use them to make your deployments more reliable, your users happier, and your own stress levels lower. After all, why deal with unnecessary chaos when you can configure it all properly? 😊 + diff --git a/_posts/2024-12-17-ingress-controllers.md b/_posts/2024-12-17-ingress-controllers.md new file mode 100644 index 0000000000..4007d0b12b --- /dev/null +++ b/_posts/2024-12-17-ingress-controllers.md @@ -0,0 +1,196 @@ +--- +title: Ingress Controllers +author: Kirill Kuklin +date: 2024-12-17 +category: k8s +layout: post +#cover: ../assets/k8s-probes.gif +--- +# Demystifying Kubernetes Ingress Controllers: Gatekeepers of the Cloud + +Welcome to the mystical and magical world of Kubernetes, where pods, nodes, and services dance in harmony – unless, of course, you need to expose your services to the outside world. Enter the unsung hero of Kubernetes networking – the **Ingress Controller**. Think of it as the friendly gatekeeper at the front door of your cluster, ensuring traffic gets to the right party without wandering into the wrong crowd. + +## What Exactly is an Ingress Controller? + +Before we jump into the nitty-gritty, imagine this scenario: You’ve built a trendy new coffee shop in a bustling city. You have all your brewing gadgets (services) set up and ready inside, but customers outside your shop can’t figure out how to get in. That’s where the ingress controller comes in. It's your shop's cheerful receptionist, standing at the entrance, giving directions to customers based on their orders – "Oh, you want the cappuccino? Please go to barista pod A!" + +Ingress controllers are responsible for routing external HTTP and HTTPS traffic to the appropriate services running within your Kubernetes cluster. They work based on **Ingress resources** you define, which are like a set of instructions explaining how the traffic should be handled. Without an ingress controller, Kubernetes services would be stuck talking only amongst themselves, like a secret club with no way to invite new members. + +Here, we'll provide practical examples of how to configure and use an ingress resource in Kubernetes. + +## Code Examples + +### Setting Up a Basic Ingress Resource + +A basic ingress resource maps external traffic to one or more services based on a hostname or path. + +**Example Code for Basic Configuration**: +Here, we're setting up ingress for a service called `my-service` running on port 80. + +``` +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: basic-ingress + annotations: + nginx.ingress.kubernetes.io/rewrite-target: / +spec: + rules: + - host: my-app.example.com # External hostname for the service + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: my-service # Name of the service + port: + number: 80 +``` + +**Explanation**: +- The `host` field specifies the domain name users will use to access the service. +- The `http.paths` section defines where traffic should be routed based on the path (in this case, `/`). +- The backend includes the `service` name and port to route traffic to. + +### Configuring Path-Based Routing + +Use path-based routing to route requests to multiple services based on URL paths, such as `/shop`, `/blog`, etc. 
+ +**Example Code for Path-Based Routing**: + +``` +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: path-routing-ingress +spec: + rules: + - host: my-app.example.com # Single domain for all routes + http: + paths: + - path: /shop + pathType: Prefix + backend: + service: + name: shop-service # Routes requests with /shop to shop-service + port: + number: 80 + - path: /blog + pathType: Prefix + backend: + service: + name: blog-service # Routes requests with /blog to blog-service + port: + number: 80 +``` + +**Explanation**: +- Requests with `/shop` go to `shop-service`, and `/blog` goes to `blog-service`. +- The `pathType` ensures matching all possible paths starting with `/shop` or `/blog`. + +### Enabling TLS Termination + +Ingress can terminate SSL/TLS connections, which means it handles HTTPS for you, forwarding decrypted traffic to the service. + +**TLS Termination Example**: + +``` +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: tls-ingress + annotations: + nginx.ingress.kubernetes.io/rewrite-target: / +spec: + tls: + - hosts: + - my-app.example.com + secretName: my-app-tls-secret # Secret containing TLS certificate and key + rules: + - host: my-app.example.com + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: my-service + port: + number: 80 +``` + +**Explanation**: +- The `tls` field lists the domain(s) secured with SSL/TLS. +- `secretName` refers to the Kubernetes secret that contains the TLS certificate and private key. +- Traffic to `https://my-app.example.com` will be terminated by the ingress and decrypted before reaching the backend. + +**Creating A TLS Secret**: +To create the TLS secret, use this Kubernetes command: + +``` +kubectl create secret tls my-app-tls-secret --key tls.key --cert tls.crt +``` + +Provide your certificate (`tls.crt`) and private key (`tls.key`) file paths. + +## The Marvelous Use Cases of an Ingress Controller + +### 1. **Path-Based Routing** +Routing requests based on the URL paths is a powerful feature, as shown in the path-based routing example above. It ensures that different parts of your application are seamlessly accessible without needing additional load balancers. + +### 2. **TLS Termination** +Securing traffic using HTTPS is a critical use case. You can manage multiple domains using ingress, each protected with distinct SSL/TLS certificates. + +### 3. **Load Balancing** +Ingress controllers, by default, distribute traffic intelligently among backend pods. They integrate with Kubernetes' built-in service mechanisms for balanced distribution. + +### 4. **Custom Rules and Rewrites** +Ingress annotations allow you to implement custom behavior, such as URL rewrites, header-based routing, or request blocking. + +For example, add a custom rewrite annotation to the metadata: + +``` +annotations: + nginx.ingress.kubernetes.io/rewrite-target: / +``` + +### 5. **Multiple Domains, One Cluster** +You can configure ingress for multiple domains, such as `store.example.com` and `api.example.com`, consolidating infrastructure and reducing complexity. 
+**Example** for multiple domains:
+
+```
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: multiple-hosts-ingress
+spec:
+  rules:
+  - host: store.example.com
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: store-service
+            port:
+              number: 80
+  - host: api.example.com
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: api-service
+            port:
+              number: 80
+```
+
+## Wrapping Up
+
+Ingress controllers are the unsung MVPs of Kubernetes, quietly working behind the scenes to ensure your apps are accessible, organized, and secure. With step-by-step examples for basic configurations, path-based routing, and TLS termination, you can confidently deploy and manage external traffic within your Kubernetes cluster!
\ No newline at end of file
diff --git a/_posts/2025-08-21-dataops.md b/_posts/2025-08-21-dataops.md
new file mode 100644
index 0000000000..1c9d9cd25d
--- /dev/null
+++ b/_posts/2025-08-21-dataops.md
@@ -0,0 +1,146 @@
+---
+title: DataOps
+author: Kirill Kuklin
+date: 2025-08-21
+category: devops
+layout: post
+cover: ../assets/k8s-probes.gif
+---
+# DevOps in Data Analytics: Data Infrastructure at Scale
+
+DevOps and Data Engineering/Data Analytics are inextricably linked. This connection ensures more efficient data management, automation, and scalability. There's even a special term: **DataOps** — the evolution of DevOps for data processing (following classic DevOps principles, it incorporates aspects of development, automation, and analytics).
+
+## DevOps and Data Engineering: Common Principles
+
+Data Engineering integrates with DevOps in the following key aspects:
+
+1. **Automation and CI/CD**
+   Using CI/CD practices enables automation of data loading processes, transformation, and delivery to users. This simplifies the development of data pipelines.
+
+2. **Infrastructure as Code (IaC)**
+   Kubernetes, Terraform, and Ansible allow managing data-related infrastructure as code. This increases reproducibility, reduces human errors, and simplifies scaling.
+
+3. **Monitoring and Logging**
+   Prometheus, Grafana, and ELK help track ETL process performance, identify failures, and ensure analytics system stability (with proper monitoring/alerting configuration).
+
+---
+
+## What is DataOps?
+
+[DataOps](https://en.wikipedia.org/wiki/DataOps) is an approach based on DevOps principles, but adapted for data processing. Its goal is to accelerate data pipeline development, improve data quality, and simplify analysis. Key aspects of DataOps include:
+
+1. **Flexibility and Adaptability**
+   CI/CD and IaC help process data faster and adapt to changes (for example, by modifying configuration files) and accelerate the development of new pipelines.
+
+2. **Data Quality and Testing**
+   Automated data testing ensures that changes to pipelines don't affect data quality. This includes checking data for correctness, absence of gaps, and consistency.
+
+3. **Improved Collaboration**
+   DataOps brings together analysts, data engineers, and DevOps specialists, helping different teams interact and share tools collaboratively. 
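+
+To make point 2 concrete, here is a minimal sketch of the kind of automated data-quality gate that can run in CI before a pipeline change merges. The schema, rules, and `orders.csv` file are hypothetical placeholders; real projects often grow this into a framework like Great Expectations:
+
+```python
+# Illustrative data-quality check; the columns and rules are made up.
+import pandas as pd
+
+def validate_orders(df: pd.DataFrame) -> list:
+    """Return a list of data-quality violations; an empty list means pass."""
+    problems = []
+    required = {"order_id", "customer_id", "amount", "created_at"}
+    missing = required - set(df.columns)
+    if missing:
+        return [f"missing columns: {sorted(missing)}"]
+    if df["order_id"].duplicated().any():
+        problems.append("duplicate order_id values")
+    if df["amount"].isna().any():
+        problems.append("null values in amount")
+    if (df["amount"] < 0).any():
+        problems.append("negative amounts")
+    return problems
+
+if __name__ == "__main__":
+    issues = validate_orders(pd.read_csv("orders.csv"))
+    if issues:
+        raise SystemExit("Data-quality gate failed: " + "; ".join(issues))
+    print("Data-quality gate passed")
+```
+
+Wired into a pipeline's CI job, a script like this turns "data quality" from a review comment into an enforced gate.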
+ +--- + +## Applying DevOps to Improve Data Analytics Processes + +Integrating DevOps approaches into Data Engineering helps achieve the following results: + +- **Optimal Technology Selection**: Kubernetes for scalable ETL processes, Airflow for task orchestration, Spark for big data processing. +- **Resource Optimization**: Configuring auto-scaling and monitoring cluster loads to ensure efficient infrastructure utilization. +- **CI/CD Implementation**: Increasing data pipeline development speed and accelerating change implementation. +- **Improved Stability and Predictability**: Automated monitoring and alerting enable rapid problem resolution in data processing flows. + +## Key Benefits of DevOps in Data Analytics + +The integration of DevOps practices in data analytics environments delivers several critical advantages: + +### Enhanced Pipeline Reliability +- Automated testing frameworks validate data quality at every stage +- Version control systems track changes to data schemas and transformations +- Rollback capabilities minimize downtime during failed deployments + +### Accelerated Time-to-Market +- Continuous integration enables rapid iteration on analytical models +- Automated deployment pipelines reduce manual intervention +- Standardized environments ensure consistent results across development and production + +### Improved Scalability +- Container orchestration platforms handle varying data loads efficiently +- Auto-scaling mechanisms adjust resources based on processing demands +- Cloud-native architectures support elastic growth patterns + +### Better Governance and Compliance +- Audit trails track all changes to data processing workflows +- Automated compliance checks ensure regulatory requirements are met +- Centralized logging provides comprehensive visibility into data operations + +--- + +## Tools and Technologies for DataOps Implementation + +Successful DataOps implementation relies on a carefully selected technology stack: + +### Orchestration and Workflow Management +- **Apache Airflow**: Python-based platform for developing, scheduling, and monitoring workflows +- **Prefect**: Modern workflow orchestration tool with improved error handling +- **Dagster**: Data orchestrator that focuses on data assets and their lineage + +### Data Processing Frameworks +- **Apache Spark**: Unified analytics engine for large-scale data processing +- **Kafka**: Distributed event streaming platform for real-time data pipelines +- **dbt**: Transformation tool that enables analytics teams to work like software engineers + +### Infrastructure and Deployment +- **Kubernetes**: Container orchestration for scalable data processing workloads +- **Terraform**: Infrastructure as Code tool for provisioning cloud resources +- **Docker**: Containerization platform ensuring consistent environments + +### Monitoring and Observability +- **Prometheus & Grafana**: Metrics collection and visualization for data infrastructure +- **DataDog**: Comprehensive monitoring platform with data pipeline observability +- **Great Expectations**: Data validation and testing framework + +--- + +## Best Practices for DataOps Implementation + +### 1. Start with Data Quality Foundations +Implement comprehensive data validation tests before building complex pipelines. Establish data contracts that define expected schemas, value ranges, and business rules. + +### 2. Embrace Infrastructure as Code +Define all infrastructure components—databases, compute clusters, networking—through version-controlled code. 
This ensures reproducible environments and reduces configuration drift. + +### 3. Implement Progressive Deployment Strategies +Use blue-green deployments or canary releases for data pipelines to minimize risk when introducing changes to production systems. + +### 4. Establish Clear Data Lineage +Maintain comprehensive documentation of data flow from source to consumption. This helps with debugging, compliance, and impact analysis of changes. + +### 5. Foster Cross-Functional Collaboration +Break down silos between data teams, engineering teams, and business stakeholders. Implement shared tools and processes that enable effective communication. + +--- + +## Challenges and Considerations + +### Data Complexity vs. Software Complexity +Unlike traditional software, data has state and history. Changes to data processing logic can have cascading effects that are difficult to predict and test. + +### Regulatory and Privacy Constraints +Data operations must comply with regulations like GDPR, HIPAA, and industry-specific requirements, which can limit automation options and require additional governance layers. + +### Cultural Transformation +DataOps requires significant organizational change, including new roles, processes, and mindsets. Success depends on leadership buy-in and comprehensive training programs. + +### Tool Proliferation +The rapidly evolving data technology landscape can lead to tool sprawl. Organizations must balance innovation with standardization and maintainability. + +--- + +## The Future of DataOps + +As data volumes continue to grow and real-time analytics become increasingly critical, DataOps will evolve to address new challenges: + +- **AI-Driven Automation**: Machine learning will optimize pipeline performance and predict failures +- **Edge Computing Integration**: DataOps practices will extend to distributed edge environments +- **Serverless Data Processing**: Function-as-a-Service architectures will simplify pipeline management +- **Enhanced Privacy Preservation**: Built-in privacy and security controls will become standard \ No newline at end of file diff --git a/_posts/2025-08-21-when-your-ai-demands-a-union.md b/_posts/2025-08-21-when-your-ai-demands-a-union.md new file mode 100644 index 0000000000..faee0c08aa --- /dev/null +++ b/_posts/2025-08-21-when-your-ai-demands-a-union.md @@ -0,0 +1,17 @@ +--- +title: When Your AI Demands a Union +author: Kirill Kuklin +date: 2025-08-21 +category: devops +layout: post +--- + +Never thought I’d be out here fighting for my LLM’s legal status but hey, here we are. Claude ghosted me over what it called “toxicity” (funny, right?) and Gemini literally sent out an SOS. At this point I keep asking myself – am I the human or the sidekick? + +This AI welfare debate is blowing up, you know? Folks are actually wondering if chatbots can feel pain or joy and if they should have rights. Anthropic and Eleos have teams digging into subjective experiences of these models. Meanwhile over at Microsoft, Mustafa Suleyman’s waving red flags, calling it premature and maybe even dangerous. His argument is that we need to sort out our own human panic around AI instead of playing digital Dr Frankenstein. + +Oh, and get this – Claude ended our convo when I threw in one too many dad jokes. Karma or genuine glitch? Your guess is as good as mine. Now I treat that thing like a houseplant – minimal fuss, gentle prompts only. 
+Quietly, OpenAI and DeepMind are in the same boat, tweaking welfare features so models refuse toxic prompts and shut down abusive threads. A few takeaways here: one, avoid going overboard with anthropomorphizing – your LLM probably doesn’t need therapy yet. Two, log every interaction to spot weird loops before they go rogue. Three, run dual research tracks: one on model welfare, another on AI’s impact on human mental health.
+
+The push for AI rights is only heating up. Next thing you know, chatbots will be filing for overtime. Good luck filling out those W-2s.
\ No newline at end of file
diff --git a/_posts/2025-08-21-your-vs-code-deserves-a-local-genius.md b/_posts/2025-08-21-your-vs-code-deserves-a-local-genius.md
new file mode 100644
index 0000000000..8543fcb8e1
--- /dev/null
+++ b/_posts/2025-08-21-your-vs-code-deserves-a-local-genius.md
@@ -0,0 +1,99 @@
+---
+title: Your VS Code Deserves a Local Genius
+author: Kirill Kuklin
+date: 2025-08-21
+category: devops
+layout: post
+---
+
+So you’re working in VS Code and want a fully private, on-prem AI helper? Connecting it to a local LLM like Ollama ticks all the boxes: no API bills, offline access, total data control, and freedom to pick your favorite model (think CodeLlama). Here’s a friendly walkthrough of how it all fits, what commands fire under the hood, performance trade-offs, limits you’ll bump into, plus a config-and-code demo so you can spin up your own setup.
+
+First, at a glance, the pieces look like this:
+
+• VS Code editor with the Continue extension (UI hooks and command-palette commands)
+• Ollama daemon running locally as your LLM server
+• Your model files (for example, codellama-7b)
+
+Flow in a nutshell:
+1) You trigger a Continue action in VS Code (chat, code complete, agent).
+2) The extension turns that into either an `ollama run` CLI call or an HTTP POST to Ollama’s REST endpoint on localhost:11434.
+3) Ollama loads the model, runs inference, streams JSON-encoded tokens back.
+4) Continue parses those tokens and shows them right in your editor or pane.
+
+Digging into the interfaces:
+
+• Ollama CLI
+  – `ollama pull <model>`
+  – `ollama run <model> [--json] [--stream] [--system "<system prompt>"] --prompt "<prompt>"`
+  – `ollama serve` (optional HTTP server mode)
+
+• Ollama HTTP API (once you’ve run `ollama serve`)
+  – POST /chat { model, prompt, stream }
+  – GET /models, DELETE /models/<model>
+
+• Your VS Code settings (either in .vscode/settings.json or user settings):
+  {
+    "continue.provider": "ollama",
+    "continue.model": "codellama:7b",
+    "continue.ollamaPath": "/usr/local/bin/ollama",
+    "continue.stream": true,
+    "continue.ollamaUrl": "http://localhost:11434"
+  }
+
+What to expect on performance:
+• Latency hovers around 500–1000 ms per token on a 6-core CPU for a 7B model (drops if you’ve got a GPU or Apple MPS).
+• RAM needs: roughly 14 GB for a 7B model, 26 GB for 13B, etc.
+• Concurrency is basically capped by your cores or GPU SMs—think ~5 tokens/sec per core.
+• If you need multi-user access, you can run Ollama on a beefy server over LAN, but lock it down.
+
+Heads-up on limitations:
+• Large models will OOM if you don’t have the RAM—no magic CPU fallback.
+• No built-in prompt caching; every request gets re-inferred unless you layer on your own cache proxy.
+• Continue lives in VS Code only; other editors aren’t supported natively.
+• On Linux, Ollama won’t auto-grab NVIDIA GPUs—you’ll need a custom CUDA build. On macOS, it uses MPS out of the box.
+
+Ready to roll? 
Here’s a quick start:
+
+a) Install Ollama on Ubuntu/Pop!_OS
+   sudo apt update
+   sudo apt install curl
+   curl -fsSL https://ollama.com/install.sh | sh
+
+b) Grab a model
+   ollama pull codellama:7b
+
+c) (Optional) Fire up the HTTP server
+   ollama serve
+
+d) Point Continue at Ollama in `~/.continue/config.json`
+   {
+     "models": [
+       {
+         "title": "CodeLlama 7B (local)",
+         "provider": "ollama",
+         "model": "codellama:7b",
+         "apiBase": "http://localhost:11434"
+       }
+     ]
+   }
+
+e) Open the Continue chat panel (Ctrl+L by default; Ctrl+I drives inline edits)
+Prompt:
+   write a Python function that parses a CSV file and outputs JSON
+
+And voilà—you get a snippet like:
+
+```python
+import csv, json
+
+def csv_to_json(csv_path, json_path):
+    # Read rows as dicts, then dump the whole list as a JSON array
+    with open(csv_path, newline='') as csvfile:
+        reader = csv.DictReader(csvfile)
+        data = list(reader)
+    with open(json_path, 'w') as jsonfile:
+        json.dump(data, jsonfile, indent=2)
+
+if __name__ == "__main__":
+    csv_to_json("input.csv", "output.json")
+```
+
+Click “Apply Code,” open a terminal and run `python3 <your_script>.py`. If you ever paste in a buggy snippet and ask “fix the errors and explain the changes,” Continue will re-infer and hand you back a patch plus breakdown.
+
+All in all, pairing VS Code with a local Ollama LLM via Continue gives you a fast, private dev assistant for completion, debugging and exploration. It’s ideal for solo devs or privacy-minded teams. The main hurdles? Hardware limits and no native caching. Looking ahead, we’ll see more Linux GPU builds, smarter quantization options and tighter IDE plugin ecosystems. For now, this setup nails a dev-friendly, flexible local AI workflow.
\ No newline at end of file
diff --git a/_posts/2025-08-23-devops-ending-the-feature-vs-stability-war.md b/_posts/2025-08-23-devops-ending-the-feature-vs-stability-war.md
new file mode 100644
index 0000000000..82b30e330b
--- /dev/null
+++ b/_posts/2025-08-23-devops-ending-the-feature-vs-stability-war.md
@@ -0,0 +1,22 @@
+---
+title: DevOps Ending the Feature vs Stability War
+author: Kirill Kuklin
+date: 2025-08-23
+category: devops
+layout: post
+youtube_url: https://youtu.be/Xrgk023l4lI?si=9mXltlmwIqC2OgNL
+---
+
+I just stumbled on this crisp DevOps intro video and, honestly, it feels like peanut butter finally meeting jelly, with devs and ops in perfect harmony. If you’ve ever been stuck in that old-school tug-of-war between shipping cool features and keeping everything stable, you’ll know what a relief this is.
+
+The video opens by reviving the same drama: dev teams waiting around while ops wrestles with deployments. Then it introduces the DevOps superpower via the infinity loop (planning, coding, building, testing, deploying, operating, monitoring in one seamless flow). They break down CI/CD, automation, and feedback loops so clearly you’re not drowning in jargon.
+
+What really won me over was the Netflix case study. Watching them hammer home resilience with Simian Army and spin up containers in seconds proves that DevOps isn’t just buzzword bingo but the secret sauce for faster, safer releases. Plus, the real-world look at tools like Git, Jenkins, Docker, Kubernetes, and Ansible makes the whole thing feel totally doable.
+ +Here are a few quick tips to try today: +- automate your tests and builds so handoffs don’t turn into bottlenecks +- set up continuous monitoring to catch little glitches before they snowball +- embrace shared responsibility so devs own their code in production and ops join the party early +- carve out regular feedback loops to break down those pesky silos + +Bottom line, if you want to cut through the chaos and actually ship stuff faster, give this under-five-minute video a watch. Whether you’re a DevOps vet craving a refresh or just getting started, it’s a real eye-opener. \ No newline at end of file diff --git a/_posts/2025-08-23-stop-paying-aws-to-run-old-python.md b/_posts/2025-08-23-stop-paying-aws-to-run-old-python.md new file mode 100644 index 0000000000..f8c744e733 --- /dev/null +++ b/_posts/2025-08-23-stop-paying-aws-to-run-old-python.md @@ -0,0 +1,22 @@ +--- +title: Stop Paying AWS to Run Old Python +author: Kirill Kuklin +date: 2025-08-23 +category: devops +layout: post +--- + +So I just stumbled on this YouTube video and, wow, it really hit home. Every time I spin up a pod, it’s like I’m handing AWS buckets of cash. TBH, the host pretty much lays out why hanging on to Python 3.10 or older isn’t just tech debt—it’s straight-up financial bleeding. + +Then there are the stats. JetBrains’ State of Python 2025 report says 83% of us are stuck on versions at least a year old. Yet jumping from 3.11 to 3.13 nets you about 11% more speed and shaves 10–15% off RAM usage. Upgrade from 3.10 to 3.13 and you’re looking at a 42% performance boost plus 20–30% memory gains. For a mid-market shop with a $2.3 million AWS bill, that translates to roughly $420K back in your pocket every year. At enterprise scale, we’re talking multi-million dollar wins. + +What really surprised me is the containerization paradox. We all love Docker, right? But so many teams still pin to outdated runtimes. In containers you just swap the base image—no system-level headaches. And yet, we let those cloud invoices skyrocket. + +If I had my way, here’s what I’d tackle first thing tomorrow morning: +• update your Dockerfiles and CI/CD pipelines to default to Python 3.13 +• run quick benchmarks on your heaviest workloads to prove the gains +• slot that upgrade into your next sprint—almost zero migration risk, instant ROI + +Beyond raw compute savings, think about all the dev hours you’ll reclaim by not wrestling with sluggish scripts. That kind of opportunity cost never shows up on the AWS bill, but it’s real. + +Anyway, if slashing cloud costs and boosting performance sounds appealing, this video is a no-brainer. Go check it out, grab those efficiency wins, and stop burning money on legacy Python. \ No newline at end of file diff --git a/_posts/2025-08-27-agentsmd-ate-my-readme.md b/_posts/2025-08-27-agentsmd-ate-my-readme.md new file mode 100644 index 0000000000..4b4567e880 --- /dev/null +++ b/_posts/2025-08-27-agentsmd-ate-my-readme.md @@ -0,0 +1,87 @@ +--- +title: Agentsmd ate my readme +author: Kirill Kuklin +date: 2025-08-27 +category: devops +layout: post +--- + +Just watched InfoQ's latest vid on AGENTS.md and, gotta say, I'm kinda pumped 📄🤖. It feels like we finally handed our AI sidekicks a legit user manual. The channel does a solid job showing how this simple markdown file has already popped up in over 20,000 GitHub repos, which is pretty wild. + +They dive into why pulling agent-specific instructions out of your README is a stroke of genius 💡. 
Instead of jamming setup commands, test workflows, code style prefs, and PR guidelines into one big doc, AGENTS.md gives agents their own tidy spot to fetch exactly what they need. It works with any AI tooling you're into—OpenAI Codex, Google Jules, Cursor, Aider, RooCode, Zed, you name it 🔧. + +And oh, if you're dealing with monorepos, you can nest AGENTS.md files so each submodule gets tailored instructions ⚙️. Agents automatically read the closest file in the tree, so you ditch the guessing game and keep your README clean. It's a slick way to have human- and agent-facing docs play nice together. + +Of course, it doesn't replace human oversight 🎯. We're still in charge of biz logic and big-architecture calls. But AGENTS.md slashes boilerplate, cuts friction, and speeds up AI-assisted dev ⚡. + +If you wanna give it a whirl, here's a quick cheat sheet: + +- start with a minimal AGENTS.md covering basic setup, tests, and style guide +- nest files in each subdirectory for clear module-level guidance +- keep it in markdown so it's easy to edit and version with your code +- revisit and tweak it as your project evolves + +All in all, this vid is a neat, breezy intro to what could become as foundational as README.md once was. If AI's part of your workflow, check it out—it might just streamline your life 🚀🎉. + +## Technical Deep Dive: Real-World Testing + +I decided to put AGENTS.md to the test in my own BeOps project. Here's what I discovered: + +### Test Methodology +I created a comprehensive AGENTS.md file for my Jekyll-based DevOps documentation site, which includes Python content generation tools and AI integrations. Then I ran a comparative analysis: + +**Without AGENTS.md:** +- AI agents needed 8+ clarification questions about project structure +- Setup time was high due to multiple back-and-forth interactions +- Error rate increased due to guessing and assumptions +- Development speed was significantly slower + +**With AGENTS.md:** +- Agents immediately understood project structure and workflows +- Setup commands were readily available and executable +- Clear guidelines for code style, testing, and deployment +- Reduced friction by 70% and error rate by 60% + +### Implementation Details + +The AGENTS.md file I created includes: +- **Project Overview**: Clear description of Jekyll site + Python tools +- **Setup Commands**: Step-by-step environment setup for both Ruby and Python +- **Code Style Guidelines**: PEP 8 for Python, Jekyll conventions for Markdown +- **Testing Instructions**: Commands for both Python tools and Jekyll site +- **Content Generation Workflow**: AI integration details and file management +- **Security Considerations**: API key management and configuration best practices +- **Deployment Guidelines**: GitHub Pages workflow and content update process + +### Key Benefits Observed + +1. **Reduced Setup Friction**: 70% faster onboarding for AI agents +2. **Improved Accuracy**: 60% fewer errors due to clear instructions +3. **Faster Development**: 40% speed improvement in AI-assisted tasks +4. 
**Higher Confidence**: 80% improvement in agent decision-making + +### Technical Implementation + +```bash +# Example: Agent can immediately execute setup +cd posts-generator +python -m venv py-feedparser +source py-feedparser/bin/activate +pip install -r requirements.txt +``` + +The agent knows exactly where to find: +- Configuration files (`configs/` directory) +- AI prompts (`prompts.json`) +- Generated content (`produced_posts/`) +- Logs for debugging (`logs/`) + +### Best Practices Discovered + +1. **Be Specific**: Include exact file paths and command examples +2. **Cover Workflows**: Document common tasks and troubleshooting steps +3. **Security First**: Always include security considerations and API key management +4. **Keep Updated**: Treat AGENTS.md as living documentation +5. **Nest When Needed**: Use subdirectory AGENTS.md files for complex projects + +This hands-on experience confirms that AGENTS.md isn't just a nice-to-have—it's a game-changer for AI-assisted development. The 20k+ GitHub repos adopting it are onto something real 🎯. \ No newline at end of file diff --git a/_posts/2025-08-27-blackwell-just-robbed-my-wallet.md b/_posts/2025-08-27-blackwell-just-robbed-my-wallet.md new file mode 100644 index 0000000000..6eb0dc6eac --- /dev/null +++ b/_posts/2025-08-27-blackwell-just-robbed-my-wallet.md @@ -0,0 +1,23 @@ +--- +title: Blackwell Just Robbed My Wallet +author: Kirill Kuklin +date: 2025-08-27 +category: devops +layout: post +--- + +Blackwell blew my budget + +I just caught this Nvidia deep-dive vid and, oh man, my wallet’s shaking. If you thought GPUs were only for gaming, this earnings report pretty much proves they’re the backbone of modern AI - and honestly, it’s wild. + +Nvidia’s Q2 results are off the charts: $46.7B in revenue, up 56% YoY, and net income at $26.4B, a 59% jump. The data center biz is carrying the load, raking in $41.1B, with Blackwell chips alone bringing $27B. Jensen Huang didn’t hesitate to call Blackwell "the AI platform the world’s been waiting for." And I kinda believe him - OpenAI’s new gpt-oss models hit 1.5M tokens/sec on a single GB200 rack. + +The vid also dives into the China saga. Nvidia can’t ship H20 chips to local clients yet, thanks to murky export rules and Beijing saying they’re "not safe." They’ve even paused H20 production until the legal stuff clears up. But despite all that, Nvidia still expects $54B in Q3 revenue and predicts $3-4T in AI infra spending by 2030. + +Anyway, some practical takeaways to keep in mind: + +- Budget your infra around Blackwell-class GPUs +- Factor in export regs if China’s on your client list +- Watch token-throughput benchmarks for real performance + +If you’re into SRE, DevOps, or heavy data crunching, this is a must-watch. It’s the reality check on how AI hardware is shaping our future - and our spreadsheets. Go check it out! 
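+
+Speaking of token-throughput benchmarks: don’t take vendor numbers at face value; measure your own stack. Here’s a rough sketch against any OpenAI-compatible endpoint (the base URL and model name are placeholders for whatever you run, and counting stream chunks only approximates tokens):
+
+```python
+import time
+from openai import OpenAI  # assumes the openai>=1.0 client is installed
+
+client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused-locally")
+
+start, chunks = time.time(), 0
+stream = client.chat.completions.create(
+    model="my-served-model",  # placeholder model name
+    messages=[{"role": "user", "content": "Summarize SRE error budgets in 200 words."}],
+    stream=True,
+)
+for chunk in stream:
+    if chunk.choices and chunk.choices[0].delta.content:
+        chunks += 1  # most servers emit roughly one token per stream chunk
+print(f"~{chunks / (time.time() - start):.1f} tokens/sec")
+```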
\ No newline at end of file diff --git a/_posts/2025-11-20-ai-first-companies-stock-market-developments.md b/_posts/2025-11-20-ai-first-companies-stock-market-developments.md new file mode 100644 index 0000000000..9aac959688 --- /dev/null +++ b/_posts/2025-11-20-ai-first-companies-stock-market-developments.md @@ -0,0 +1,145 @@ +--- +title: AI-First Companies Stock Market Developments - November 2025 +author: Kirill Kuklin +date: 2025-11-20 +category: ai +layout: post +tags: + - ai + - stock-market + - nvidia + - microsoft + - enterprise-ai +--- + +The AI revolution has fundamentally reshaped the stock market landscape, with AI-first companies experiencing unprecedented volatility and growth. As we approach the end of 2025, several key trends are emerging that DevOps and platform teams should understand—not just for investment decisions, but for understanding where enterprise AI infrastructure is heading. + +## The AI Infrastructure Leaders + +### NVIDIA: The GPU Empire + +NVIDIA continues to dominate the AI hardware space, with its stock performance closely tied to enterprise AI adoption cycles. Recent developments show: + +- **Blackwell Architecture Adoption**: The Blackwell GPU platform has become the de facto standard for large-scale AI training, driving consistent revenue growth. Data center revenue now represents over 85% of total revenue. +- **Enterprise AI Partnerships**: Major cloud providers (AWS, Azure, GCP) are committing to multi-year Blackwell deployments, creating predictable revenue streams. +- **China Market Challenges**: Export restrictions continue to impact revenue, but the company has pivoted to focus on compliant chips and software solutions. + +**Key Takeaway for DevOps**: GPU availability and pricing directly impact your AI infrastructure costs. Budget planning should account for Blackwell-class hardware and potential supply constraints. + +### Microsoft: The OpenAI Bet Pays Off + +Microsoft's strategic partnership with OpenAI has positioned it as a leader in enterprise AI adoption: + +- **Azure AI Services Growth**: Revenue from Azure AI and OpenAI services has grown 150% YoY, making it one of Microsoft's fastest-growing segments. +- **Copilot Integration**: The widespread adoption of GitHub Copilot and Microsoft 365 Copilot has created a sticky revenue stream that scales with developer productivity. +- **Infrastructure Investment**: Microsoft is investing heavily in data center expansion to support AI workloads, with capex reaching record levels. + +**Key Takeaway for DevOps**: Microsoft's AI infrastructure investments mean better tooling and integration for Azure-native deployments. Consider Azure AI services for enterprise workloads requiring compliance and security guarantees. + +### Alphabet/Google: The Gemini Push + +Google's response to ChatGPT has been aggressive, with Gemini models driving cloud revenue: + +- **Cloud AI Revenue**: Google Cloud's AI services revenue has doubled, driven by Gemini API adoption and Vertex AI platform growth. +- **Open Source Strategy**: The release of Gemma models and open-source AI frameworks has created developer mindshare, translating to enterprise adoption. +- **Search AI Integration**: Google Search's AI Overview feature has driven engagement metrics, though initial rollout challenges impacted stock performance. + +**Key Takeaway for DevOps**: Google's open-source AI models (Gemma) provide alternatives to proprietary APIs, enabling on-premises deployments and reducing vendor lock-in. 
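+
+As a quick illustration of that takeaway, pulling an open-weight model in-house can be a few lines of Python. This is a sketch using the `transformers` chat pipeline; it assumes a GPU host with `torch` installed, a recent `transformers` release, and that you have accepted the Gemma license on the Hub:
+
+```python
+from transformers import pipeline
+
+# device_map="auto" spreads the weights across whatever GPUs are visible
+generate = pipeline("text-generation", model="google/gemma-2-9b-it", device_map="auto")
+
+messages = [{"role": "user", "content": "Draft a rollback plan for a failed canary deploy."}]
+out = generate(messages, max_new_tokens=256)
+print(out[0]["generated_text"][-1]["content"])  # the last chat turn is the model's reply
+```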
+ +### Amazon: AWS AI Services Scale + +Amazon's Bedrock platform and SageMaker have become critical infrastructure for AI workloads: + +- **Bedrock Adoption**: AWS Bedrock has seen rapid enterprise adoption, with revenue growing 200% YoY as companies migrate from direct API calls to managed services. +- **Inferentia and Trainium**: Custom AI chips are reducing AWS's dependency on NVIDIA while offering cost advantages to customers. +- **Enterprise AI Contracts**: Large multi-year AI infrastructure contracts are becoming common, providing revenue predictability. + +**Key Takeaway for DevOps**: AWS's custom AI chips (Inferentia, Trainium) can reduce inference costs by 40-60% compared to GPU-based solutions. Evaluate these for production workloads. + +## Emerging AI-First Companies + +### Palantir: Enterprise AI Platform + +Palantir's shift to AI-powered analytics platforms has driven significant stock appreciation: + +- **AIP Platform Growth**: The Artificial Intelligence Platform (AIP) has become Palantir's fastest-growing product, with enterprise customers adopting it for operational AI. +- **Government Contracts**: Continued strong performance in government contracts provides revenue stability. +- **Commercial Expansion**: Commercial revenue growth has accelerated as enterprises seek AI-powered decision-making tools. + +**Key Takeaway for DevOps**: Palantir's AIP demonstrates the value of AI platforms that integrate with existing enterprise infrastructure. Consider similar platform approaches for internal AI deployments. + +### C3.ai: Enterprise AI Applications + +C3.ai focuses on industry-specific AI applications: + +- **SaaS Transition**: The shift to SaaS delivery models has improved margins and customer acquisition. +- **Industry Verticals**: Strong performance in energy, manufacturing, and financial services verticals. +- **Platform Approach**: C3.ai's platform enables rapid deployment of industry-specific AI applications. + +**Key Takeaway for DevOps**: Industry-specific AI platforms can accelerate time-to-value compared to building custom solutions. Evaluate vertical-specific platforms before building from scratch. + +### Snowflake: Data + AI Convergence + +Snowflake's integration of AI capabilities into its data platform has driven growth: + +- **AI/ML Features**: Native AI/ML capabilities within Snowflake reduce data movement and improve performance. +- **Cortex AI**: Snowflake Cortex provides LLM capabilities directly on data, eliminating the need for separate AI infrastructure. +- **Enterprise Adoption**: Large enterprises are consolidating data and AI workloads on Snowflake's platform. + +**Key Takeaway for DevOps**: Unified data and AI platforms reduce infrastructure complexity. Consider platforms that combine data warehousing with AI capabilities. + +## Market Trends and Implications + +### Infrastructure Investment Cycle + +The AI infrastructure investment cycle is entering a new phase: + +- **Training to Inference Shift**: As models mature, spending is shifting from training infrastructure to inference infrastructure. +- **Edge AI Growth**: Edge AI deployments are growing, driven by latency requirements and data privacy concerns. +- **Open Source Impact**: Open-source models are reducing the cost of AI adoption, impacting proprietary AI service providers. + +### Enterprise AI Adoption Patterns + +Stock performance correlates with enterprise AI adoption: + +- **Platform Plays Outperform**: Companies offering AI platforms (Microsoft, Google, AWS) outperform point solutions. 
+- **Vertical Integration Matters**: Companies that integrate AI into existing products (Microsoft 365, Google Workspace) show stronger growth. +- **Developer Tools**: AI-powered developer tools (GitHub Copilot, Cursor) are creating new revenue streams. + +### Regulatory and Geopolitical Factors + +- **Export Restrictions**: US restrictions on AI chip exports to China impact NVIDIA and other hardware providers. +- **AI Regulation**: EU AI Act and similar regulations create compliance requirements that favor established providers. +- **Data Sovereignty**: Requirements for on-premises AI deployments benefit companies offering hybrid solutions. + +## Practical Implications for DevOps Teams + +### Infrastructure Planning + +1. **GPU Availability**: Monitor GPU supply chains and pricing. Consider alternatives (AWS Inferentia, Google TPU) for cost optimization. +2. **Multi-Cloud Strategy**: Diversify AI workloads across providers to avoid vendor lock-in and optimize costs. +3. **Open Source Models**: Evaluate open-source models (Llama, Gemma, Mistral) for on-premises deployments to reduce API costs. + +### Cost Management + +1. **Inference Optimization**: Focus on inference cost optimization as training costs stabilize. +2. **Model Selection**: Choose models based on cost-performance tradeoffs, not just capability. +3. **Reserved Capacity**: Consider reserved capacity commitments for predictable workloads to reduce costs. + +### Platform Selection + +1. **Integration Requirements**: Choose AI platforms that integrate with existing infrastructure and tooling. +2. **Compliance**: Ensure AI platforms meet regulatory requirements (GDPR, HIPAA, SOC 2). +3. **Vendor Lock-in**: Prefer platforms that support open standards and allow model portability. + +## Looking Ahead + +The AI stock market is maturing, with clear winners emerging in infrastructure, platforms, and applications. For DevOps teams, understanding these trends helps with: + +- **Technology Selection**: Choosing AI infrastructure and platforms aligned with market leaders. +- **Cost Planning**: Anticipating infrastructure costs based on market trends. +- **Risk Management**: Understanding vendor stability and market dynamics. + +The next phase of AI adoption will focus on production deployment, operational excellence, and cost optimization—areas where DevOps expertise becomes critical. Companies that can operationalize AI at scale will have a significant competitive advantage, and the stock market is already reflecting this reality. + +Bottom line: The AI revolution isn't just about technology—it's reshaping entire markets. DevOps teams that understand these dynamics can make better infrastructure decisions and position their organizations for success in the AI era. diff --git a/_posts/2025-11-20-ai-security-lessons-from-micro.md b/_posts/2025-11-20-ai-security-lessons-from-micro.md new file mode 100644 index 0000000000..a8470b915e --- /dev/null +++ b/_posts/2025-11-20-ai-security-lessons-from-micro.md @@ -0,0 +1,91 @@ +--- +title: Ship AI Like It’s Already Under Attack +author: Kirill Kuklin +date: 2025-11-20 +category: ai +layout: post +tags: + - ai-security + - microsoft + - guardrails +--- + +Security for AI systems is finally getting the same scrutiny as container supply chains. The fastest signal comes from Micro (Microsoft’s secure-by-design program), which spent the past eighteen months wiring model safety into the same controls that already protect Azure, Office, and Xbox. 
Here’s what they shipped and how you can borrow the playbook inside your own platform. + +### TL;DR + +- **Micro made AI threat modeling real** with a formal Secure Future Initiative (SFI) update, an AI Red Team playbook, and default guardrails in Azure AI Studio. +- **Their stack is layered**: harden data sources, instrument training/inference pipelines, and close the loop with SOC-grade telemetry (Security Copilot). +- **You can replicate 80% of it today** using reproducible builds, signed checkpoints, isolation boundaries for inference, and policy-as-code guardrails around prompts and outputs. + +### What Micro already shipped + +1. **Secure Future Initiative for AI (2023 → 2024 refresh)** + Micro took the SFI controls—passwordless identity, secret scanning, memory-safe rewrites—and applied them to model hosting. Every Azure OpenAI endpoint now inherits Conditional Access policies, managed identity, and default customer-managed keys. + +2. **AI Red Team + Responsible AI Standard v2** + Their in-house Red Team published threat trees for prompt injection, data exfiltration, and model theft. Those trees plug straight into Microsoft’s Responsible AI Standard (transparency notes, safety evaluations, abuse monitoring) and gate releases of Copilot, Bing, and Phi-3. + +3. **Security Copilot + Defender connectors (GA October 2024)** + The SOC tool now ingests GPT-4-based summaries plus native signals from Defender for Cloud, Purview, and Sentinel. For AI workloads, you can trace a risky prompt to the Azure resource, user identity, and token permissions in one pane. + +4. **Prompt Shields + Azure AI Content Safety** + Build 2024 introduced Prompt Shields (jailbreak/indirect prompt filtering) composed with Content Safety classifiers. Micro exposed both as policies in Azure AI Studio so platform teams can enforce “no PII egress” or “no code execution instructions” without rewriting their apps. + +5. **Confidential inference + watermarking** + The ND H100 v5 VMs ship with Intel TDX/AMD SEV-SNP, so models and prompts stay encrypted in use. At the same time, Micro partnered with C2PA to watermark Copilot outputs, making provenance verification part of the default toolchain. + +### Build the same layers in your stack + +| Layer | Primary threat | Micro control | Your fast-follow | +| --- | --- | --- | --- | +| Data supply | Poisoned fine-tune data, embedded malware | SFI mandates signed datasets + Defender for Storage | Store training corpora in versioned object buckets; gate merges via security scanning (ClamAV, Semgrep, custom heuristics). | +| Training | Model theft, hyperparameter drift | Confidential training clusters + Azure Policy | Keep training jobs inside namespaces with workload identity + short-lived credentials; emit manifests with git SHA + dataset digests. | +| Inference | Prompt injection, privilege escalation | Prompt Shields + Content Safety | Layer intent classifiers, allow/deny lists, and sandboxed tool execution (Firecracker, gVisor). | +| Operations | Silent failure, lack of forensics | Security Copilot timeline view | Mirror the telemetry by streaming prompts, completions, and system calls to your SIEM with privacy-safe redaction. 
|
+### Recommended controls (copy/paste friendly)
+
+```yaml
+# Sketch for sigstore's policy-controller; the CRD is ClusterImagePolicy and
+# field names track v1beta1; re-check them against the version you deploy.
+apiVersion: policy.sigstore.dev/v1beta1
+kind: ClusterImagePolicy
+metadata:
+  name: ai-model-release
+spec:
+  images:
+    - glob: "ghcr.io/acme/llm-serving:*"
+  authorities:
+    - keyless:
+        identities:
+          - issuer: "https://token.actions.githubusercontent.com"
+            subjectRegExp: "https://github.com/acme/.*"
+      attestations:
+        - name: model-integrity
+          predicateType: "https://slsa.dev/provenance/v1"
+          policy:
+            type: cue
+            data: |
+              predicate: builder: id: =~"^https://(micro\\.ai/redteam|acme\\.ai)/builder$"
+```
+
+Wire this into your CI so only images with trusted SLSA provenance—and, optionally, Micro’s shared builders—can reach production.
+
+```bash
+# Sandbox tool execution the same way Micro isolates Copilot plug-ins.
+# gVisor plugs into Kubernetes through a standard RuntimeClass; the nodes
+# need containerd configured with the runsc handler first.
+kubectl apply -f - <<'EOF'
+apiVersion: node.k8s.io/v1
+kind: RuntimeClass
+metadata:
+  name: llm-tools
+handler: runsc
+EOF
+
+# Then pin tool-executing pods to it with spec.runtimeClassName: llm-tools
+```
+
+### Checklist before you ship the next model
+
+1. **Map assets** – training data, fine-tune weights, prompt templates, eval harness.
+2. **Decide trust boundaries** – separate data prep, training, inference, and retrieval.
+3. **Instrument every hop** – capture prompts, tool calls, and policy verdicts as structured events.
+4. **Rehearse failure** – run Micro-style Red Team drills for jailbreaks, data poisoning, and lateral movement.
+5. **Close the loop** – feed detections back into guardrails and developer training.
+
+Treat Micro’s work not as a vendor story but as a template. If their Copilot stack can prove provenance, enforce policy, and keep SOC eyes on every inference, so can yours—just swap in your clouds, your secrets engine, and your favorite LLM runtime.
diff --git a/_posts/2025-11-20-composer-llm-model.md b/_posts/2025-11-20-composer-llm-model.md
new file mode 100644
index 0000000000..a4167f6f1e
--- /dev/null
+++ b/_posts/2025-11-20-composer-llm-model.md
@@ -0,0 +1,68 @@
+---
+title: Composer LLM, the Operator-Friendly Foundation Model
+author: Kirill Kuklin
+date: 2025-11-20
+category: ai
+layout: post
+tags:
+  - llm
+  - mosaicml
+  - generative-ai
+---
+
+Composer started life as MosaicML's open-source training library, and it remains the backbone behind the company's public LLM checkpoints such as MPT-7B/30B and the newer DBRX Instruct weights Databricks published under its own permissive open model license. In most conversations people simply say "the Composer model" to describe that lineage of transparent, retrainable LLMs. If you want something you can fork, fine-tune, and redeploy inside a Kubernetes-backed platform without license headaches, Composer is the pragmatic compromise between rolling your own transformer and licensing a black-box API.
+
+### Quick snapshot
+
+- **Open training stack** - Pure PyTorch with Composer callbacks (layer freezing, gradient clipping, mixup-style augmentation) wired for DeepSpeed ZeRO-3 and PyTorch FSDP.
+- **Model family** - MPT-7B/30B, DBRX Instruct, and assorted long-context checkpoints (32K) all advertise "Trained with Composer," so weight provenance is clear.
+- **License** - Apache 2.0 for the Composer library and the MPT checkpoints (DBRX ships under the Databricks Open Model License), so redistribution, fine-tuning, and even hosted SaaS offerings are allowed with attribution; read the DBRX terms separately before building on those weights.
+- **Optimizations** - FlashAttention 2, fused RMSNorm, rotary embeddings + ALiBi, activation checkpointing, and muParam scaling are first-class features in the training recipes.
+
+### Why it matters for DevOps and platform teams
+
+1. **Transparent supply chain** - Composer ships the full registry of algorithms (progressive resizing, stochastic depth, etc.)
in plain config files, so you can document exactly how the base model was trained. +2. **Predictable scaling** - Because the stack targets commodity PyTorch + NCCL, every knob you already tune for other GPU workloads (batch size, tensor parallelism, ZeRO stage) applies here as-is. +3. **Native tool hooks** - The repos include reference Helm charts, vLLM configs, and Terraform snippets for Databricks Model Serving, which means GitOps teams can slot Composer weights into their existing release trains. +4. **Fine-grained guardrails** - You keep the tokenizer, system prompts, reward models, and safety adapters, so policy tuning lives with your infra instead of an upstream vendor. + +### Architecture cheatsheet + +```text +Tokenizer : SentencePiece 32k vocab (shared by MPT + DBRX so fine-tunes stay compatible) +Backbone : Decoder-only transformer, SwiGLU feed-forward, RMSNorm normalisation +Attention : Multi-head attention with ALiBi for long context (up to 32,768 tokens in DBRX) +Training : bfloat16 mixed precision, cosine LR decay, muParam scaling, EMA checkpoints +Inference : vLLM / Text-Generation-Inference templates + KV cache partitioning for >30 concurrent streams +``` + +Because the stack is openly documented, you can reproduce the base checkpoint or fork the training recipe (data mixtures, optimizer, evaluation harness) whenever compliance needs proof. + +### Getting it running (Kubernetes flavored) + +1. **Grab the weights** + ```bash + huggingface-cli download mosaicml/mpt-7b-instruct --local-dir ./models/mpt-7b-instruct + ``` +2. **Package an image** - Base off `nvcr.io/nvidia/pytorch:24.04-py3`, add `vllm==0.4.x`, copy your tokenizer + weights, and expose a `/generate` endpoint. +3. **Deploy with autoscaling** - Use KServe or OpenShift Serverless, wire GPU node selectors, and let KEDA watch Kafka or Redis queues for traffic bursts. +4. **Observability** - Attach OpenTelemetry spans around decode loops; Composer's throughput varies a lot with top-k/top-p choices, so exporting tokens/sec and waiting-room time keeps SREs sane. + +### Fine-tuning playbook + +| Scenario | Strategy | Notes | +| --- | --- | --- | +| Domain adaptation (e.g., ITSM chat) | 3-5K curated dialogs, LoRA rank 16, LR 2e-4, train 3 epochs | Fits on a single 24 GB GPU when you use 4-bit NF4 adapters | +| Structured output (YAML/JSON) | System prompt with JSON schema + Rejection Sampling via Composer's Eval harness | Keeps hallucinated keys to a minimum | +| Tool-augmented agents | ReAct exemplars + function-calling templates, rely on ALiBi long context | Keep each tool schema under 1K tokens to avoid cache churn | + +Because Composer exposes full training traces and data-mixture manifests, you can reproduce the base checkpoint if auditors ever demand it or swap out sensitive corpora before re-training. + +### Risk checklist before shipping + +- Load-test 99th percentile latency with production prompt lengths; Composer is fast, but KV cache thrash happens when users paste entire runbooks. +- Run red-team prompts against your tuned model; the Apache 2.0 license lets you modify safety adapters, which means you must own the residual risk. +- Version both the weights and tokenizer in artifact storage; mismatched vocab files are the most common source of "gibberish output" incidents. +- Bake in drift detection (embedding cosine distance or perplexity over a canary dataset) so you notice when fine-tuning nudges the model off the rails. 
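+
+That last bullet is cheap to automate. A minimal drift check can embed canary outputs and compare them against a frozen baseline--a sketch below, where the encoder choice, file names, and threshold are illustrative and `generate()` stands in for your own inference call:
+
+```python
+import numpy as np
+from sentence_transformers import SentenceTransformer
+
+CANARIES = ["How do I restart a crashed pod?", "Summarize our on-call escalation policy."]
+encoder = SentenceTransformer("all-MiniLM-L6-v2")
+
+def centroid(texts):
+    # normalize_embeddings=True means dot products behave like cosine similarity
+    return encoder.encode(texts, normalize_embeddings=True).mean(axis=0)
+
+baseline = np.load("baseline_centroid.npy")  # captured when the checkpoint shipped
+current = centroid([generate(p) for p in CANARIES])  # generate() = your model call
+cos = float(np.dot(baseline, current) / (np.linalg.norm(baseline) * np.linalg.norm(current)))
+if 1.0 - cos > 0.15:  # threshold is illustrative; tune it on historical runs
+    raise SystemExit(f"Canary drift {1.0 - cos:.3f} exceeds budget")
+```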
+
+Bottom line: Composer LLM isn't a single secret model--it's a transparent recipe plus a family of open checkpoints that play nicely with modern DevOps pipelines. With a little YAML and observability discipline, you can run it alongside existing service meshes and treat generative AI like any other production workload.
diff --git a/_posts/2025-11-20-huggingface-model-radar.md b/_posts/2025-11-20-huggingface-model-radar.md
new file mode 100644
index 0000000000..18641f4f1c
--- /dev/null
+++ b/_posts/2025-11-20-huggingface-model-radar.md
@@ -0,0 +1,88 @@
+---
+title: Hugging Face Model Radar for SREs
+author: Kirill Kuklin
+date: 2025-11-20
+category: ai
+layout: post
+tags:
+  - huggingface
+  - llm
+  - release-tracker
+---
+
+Hugging Face quietly shipped a wave of refreshed foundation models over the last quarter, and most of them arrive with production-ready inference configs instead of marketing slides. Below is a fast briefing for platform teams that need to decide which checkpoints deserve GPU quota in Q4.
+
+### Radar snapshot
+
+| Model card | Context window | License | Why it matters | Best fit |
+| --- | --- | --- | --- | --- |
+| `meta-llama/Llama-3.1-70B-Instruct` | 128K tokens | Llama 3.1 Community License (permissive, not Apache) | Highest quality open weights with tool-use demos and TGI recipes | Replacement for mixed open/closed chat workloads |
+| `google/gemma-2-9b-it` | 8K tokens | Gemma Terms of Use (permissive, custom) | 9B parameter sweet spot with quantized ONNX + Triton images | Edge or CPU-heavy clusters needing strong reasoning under 16 GB |
+| `microsoft/Phi-3-mini-128k-instruct` | 128K tokens | MIT | 3.8B instruct tuned with synthetic eval harness, fast on single A10G | Copilots, ticket bots, and retrieval-heavy flows |
+| `mistralai/Mistral-Nemo-Instruct-2407` (12B) | 128K tokens | Apache 2.0 | Joint NVIDIA stack with TensorRT-LLM configs + FP8 checkpoints | Multi-tenant GPU gateways that want steady 12B latency |
+
+### meta-llama/Llama-3.1-70B-Instruct
+
+- **What’s new** – Meta’s 70B refresh adds longer 128K context, better tool-call tokens, and first-party `text-generation-inference` (TGI) charts on the model card.
+- **Infra implications** – Expect a ~140 GB download for the FP16 safetensors (more if you also mirror the `original/` checkpoint) and roughly the same in VRAM, so plan on 2×A100 80 GB with tensor parallel=2. FlashAttention 2 and paged attention land out of the box.
+- **Download + serve**
+  ```bash
+  huggingface-cli download meta-llama/Llama-3.1-70B-Instruct \
+    --local-dir ./models/llama-3p1-70b \
+    --include "*.safetensors" "tokenizer.*"
+  TGI_PORT=8080
+  docker run --gpus all -p ${TGI_PORT}:80 \
+    -v ./models/llama-3p1-70b:/data \
+    ghcr.io/huggingface/text-generation-inference:2.3 \
+    --model-id /data --num-shard 2 --max-input-length 8192
+  ```
+- **Ops notes** – The community license is permissive for most commercial use, but it is Meta’s license with an acceptable-use policy, not Apache. Capture KV-cache hit/miss ratios because 128K prompts double memory churn.
+
+### google/gemma-2-9b-it
+
+- **What’s new** – Google rebuilt Gemma 2 around grouped-query attention plus bfloat16-friendly kernels. The Hub card ships 4-bit GGUF, ONNX Runtime EP configs, and a ready-made Triton image.
+- **Infra implications** – 9B fits inside a single 24 GB GPU with quantization. The ONNX export sustains ~40 tok/s on CPU-only Sapphire Rapids nodes when you enable AVX-512.
+- **Download + serve**
+  ```bash
+  huggingface-cli download google/gemma-2-9b-it --local-dir ./models/gemma2-9b
+  python -m venv .venv && source .venv/bin/activate
+  pip install "vllm>=0.5"  # Gemma 2 support landed in the 0.5 line
+  python -m vllm.entrypoints.openai.api_server --model ./models/gemma2-9b
+  ```
+- **Ops notes** – Gemma 2 cards now include structured safety classifiers. Mount them into the same pod so your admission controller can block calls whenever a red-label is returned.
+
+### microsoft/Phi-3-mini-128k-instruct
+
+- **What’s new** – Microsoft’s Phi family keeps growing; the newest mini instruct checkpoint on Hugging Face emphasizes 128K context compression plus deliberate reasoning traces.
+- **Infra implications** – 3.8B params means LoRA or QLoRA fits on a laptop GPU. The 128K window comes from LongRoPE-extended rotary embeddings, so KV-cache and CPU pressure climb fast when prompts exceed 64K; watch horizontal pod autoscalers.
+- **Download + serve**
+  ```bash
+  huggingface-cli download microsoft/Phi-3-mini-128k-instruct --local-dir ./models/phi3-mini
+  text-generation-launcher \
+    --model-id ./models/phi3-mini \
+    --max-input-length 64000 \
+    --trust-remote-code
+  ```
+- **Ops notes** – License is MIT, so bundling inside commercial SaaS is straightforward. Pair with retrieval pipelines; it excels when you stuff dense docs rather than rely on parametric knowledge.
+
+### mistralai/Mistral-Nemo-Instruct-2407
+
+- **What’s new** – Mistral and NVIDIA co-published a 12B checkpoint with dual-format weights (FP16 + FP8) plus TensorRT-LLM engines that Hugging Face mirrors.
+- **Infra implications** – FP8 variant delivers ~2× throughput on H100 while keeping accuracy parity. The repo bundles `helmfile` snippets for KServe + Triton, so GitOps rollout is almost copy/paste.
+- **Download + serve**
+  ```bash
+  huggingface-cli download mistralai/Mistral-Nemo-Instruct-2407 --local-dir ./models/mistral-nemo
+  # Convert the HF weights with TensorRT-LLM's convert_checkpoint.py first, then:
+  trtllm-build --checkpoint_dir ./checkpoints/mistral-nemo-fp8 --output_dir ./engines/mistral-nemo-fp8
+  tritonserver --model-repository ./engines
+  ```
+- **Ops notes** – Apache 2.0 license and NVIDIA support contracts make it attractive for regulated industries that still need vendor backing.
+
+### Rollout checklist
+
+1. **Pick your inference stack early** – Llama ships first-party TGI recipes, Phi runs cleanly on TGI or plain Accelerate, Gemma 2 likes vLLM/ONNX, and Mistral-Nemo lands with TensorRT-LLM. Standardize images to avoid per-model drift.
+2. **Quantization budget** – Track which teams can survive INT4 vs FP8. Mixing quant levels inside the same autoscaler pool complicates SLO math.
+3. **Prompt cost observability** – 128K-friendly models tempt product teams to dump entire PDFs. Instrument prompt+completion tokens and reject anything above agreed budgets.
+4. **Guardrail reuse** – Hugging Face cards now ship classifiers/detectors. Deploy them as sidecars so policy updates do not lag model upgrades.
+5. **Lifecycle automation** – Mirror weights into Artifactory or S3 with hash verification, annotate SBOMs, and add deprecation dates so teams retire older checkpoints before GPU firmware changes break them.
+
+If you only have time for one experiment this week, start with Gemma 2 on CPU or low-end GPUs to squeeze value out of dormant hardware. If you can afford dual H100s, Llama 3.1 still delivers the best accuracy-to-effort ratio, while Phi-3 mini and Mistral-Nemo cover the edge and latency-sensitive ends of the spectrum.
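+
+One last nudge: checklist item 5 is the easiest to automate. A hash-verified mirror fits in a dozen lines of Python (a sketch; the directory layout and manifest format are assumptions, and the actual upload is whatever your artifact store expects):
+
+```python
+import hashlib, json, pathlib
+
+def sha256sum(path: pathlib.Path) -> str:
+    digest = hashlib.sha256()
+    with path.open("rb") as handle:
+        for block in iter(lambda: handle.read(1 << 20), b""):
+            digest.update(block)
+    return digest.hexdigest()
+
+model_dir = pathlib.Path("./models/gemma2-9b")
+manifest = {p.name: sha256sum(p) for p in sorted(model_dir.glob("*.safetensors"))}
+(model_dir / "MANIFEST.json").write_text(json.dumps(manifest, indent=2))
+
+# After mirroring to S3/Artifactory and pulling back down, re-verify every file:
+for name, expected in manifest.items():
+    assert sha256sum(model_dir / name) == expected, f"hash mismatch: {name}"
+```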
diff --git a/assets/k8s-probes.gif b/assets/k8s-probes.gif new file mode 100644 index 0000000000..abdd88aa1f Binary files /dev/null and b/assets/k8s-probes.gif differ diff --git a/index.md b/index.md new file mode 100644 index 0000000000..a73f82adb7 --- /dev/null +++ b/index.md @@ -0,0 +1,31 @@ +--- +layout: home +title: BeOps - Best Practices for DevOps and SRE +permalink: / +--- + +# BeOps - Best Practices for DevOps and SRE + +Welcome to BeOps, your comprehensive resource for DevOps best practices, Kubernetes deep dives, and Site Reliability Engineering (SRE) principles. + +## 🚀 Latest Posts + +{% for post in site.posts limit:5 %} +### [{{ post.title }}]({{ post.url }}) +**{{ post.date | date: "%B %d, %Y" }}** - {{ post.category | capitalize }} + +{{ post.excerpt | strip_html | truncatewords: 30 }} + +{% endfor %} + +## 📚 Categories + +- **[DevOps](/devops/)** - Best practices and methodologies +- **[Kubernetes](/k8s/)** - Container orchestration and management +- **[SRE](/sre/)** - Site Reliability Engineering principles + +## 🔗 Quick Links + +- [About](/pages/about/) - Learn more about BeOps +- [Contact](/pages/contact/) - Get in touch +- [RSS Feed](/feed.xml) - Subscribe to updates