Weekly Bullet #41 – Summary for the week

Here are a bunch of Technical / Non-Technical topics that I came across recently and found very resourceful.

Technical :

  • System Design – Designing a Ticket Booking Site Like Ticketmaster is one of the most common system design questions – link
  • Uber Engineering blog on their anomaly detection and alerting system – link
  • P99 CONF 2023 | Always-on Profiling of All Linux Threads by Tanel Poder – YouTube link
  • On choosing Golang as a programming language at American Express – link
  • [Gold] : Best Papers awarded in Computer Science – sorted Yearly and Topic wise – link
  • Writing an Engineering Strategy – link
    • “If there’s no written decision, the decision is risky or a trap-door decision, and it’s unclear who the owner is, then you should escalate!”
  • ThePrimeTime is a popular YouTuber and an SWE at Netflix. Here is his take on Leetcode – YouTube link
  • LWN is one of the few sane newsletters left out there. Here is their take on 2023 – link
  • The Hacker News Top 40 books of 2023 – link (60+ books)

Non-Technical :

  • HBR – 5 Generic reasons why people get laid off – link
  • Paul Graham is one of the clearest thinkers out there. Here are his most recommended books – link
  • Geeks being geeks – Why does a remote car key work when held to your head/body? – Detailed analysis link
  • Book extract

“The more we want it to be true, the more careful we have to be. No witness’s say-so is good enough. People make mistakes. People play practical jokes. People stretch the truth for money or attention or fame. People occasionally misunderstand what they’re seeing. People sometimes even see things that aren’t there.”

The Demon-Haunted World: Science as a Candle in the Dark – Carl Sagan

See you next time!

[Kubernetes]: CPU and Memory Request/Limits for Pods

In this write-up, we will explore how to make the most of the resources in a K8s cluster for the Pods running on it.

Resource Types:

When it comes to resources on a Kubernetes cluster, they can be fairly divided into two categories:

  • compressible:
    • If the usage of this resource for an application goes beyond the max, it can be throttled without directly killing the application/process.
    • example: CPU – if a container consumes too much of a compressible resource, it is throttled
  • non-compressible:
    • If the usage of this resource goes beyond the max, it cannot be directly throttled. It might lead to the process being killed.
    • example: memory – if a container consumes too much of a non-compressible resource, it is killed.

For each pod on a K8s cluster, there are mainly 4 types of resources which need tuning and management based on the application running:
CPU, Memory, Ephemeral-storage, Hugepages-<size>

Each of the above-mentioned resources can be managed at the provisioning level and at the usage-cap level on K8s. That is where requests/limits in K8s come in handy.

Request/Limits:

Requests and limits are an important part of resource management for Pods and containers.

Requests: where you define how much of a resource your pod needs when it is being scheduled on a worker node.
Limits: where you define the max value that resource consumption can stretch to while running on the worker node.

Let’s consider the deployment YAML file for an application which has requests/limits defined on CPU and memory.
It is important to note that when a pod is provisioned on a worker node by the Kubernetes scheduler, the value mentioned in requests is taken into consideration: the worker node needs to have the amount of resource described in the requests field for the pod to be scheduled successfully.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  template:
    metadata:
      annotations:
        link.argocd.argoproj.io/external-link: app.argo.com/main/
    spec:
      containers:
      - name: app1
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "500Mi"
            cpu: "800m"


At a high level, the concept of requests/limits is similar to soft/hard limits for resource consumption (more like -Xms/-Xmx in Java). These values are generally defined in the deployment file for the pod.
It is an option to either set requests/limits individually or skip them altogether (based on the kind of resource). If the requests and limits are set incorrectly, this could lead to various issues like:

  • pod instability
  • worker nodes being underused
  • incorrect configuration for compressible and non-compressible resources
  • worker nodes being over-committed
  • directly affecting the Quality of Service class for a pod (Guaranteed, Burstable, BestEffort)

Now let’s try to fit in different request/limit settings for the CPU and Memory resources of an application deployed on a K8s cluster.

CPU :

  • CPU is a compressible resource – it can be throttled.
  • It is an option to NOT set a limit on CPU. In that case, if there is unused CPU available on the worker node, the pods without limits can use that spare CPU (over-committing the node).
  • Skipping limits is an option for compressible resources, because they can simply be throttled when the worker needs the CPU back.
  • If your application needs the Guaranteed Quality of Service class, then set Request == Limit.
  • Below is a general plot of requests/limits for CPU.

Memory :

  • Memory is a non-compressible resource – it cannot be throttled. If a container uses more memory than its limit, it will be killed by the kubelet.
  • You cannot skip limits the way you can for CPU, because when the memory need of the app increases, it will eat into the worker node’s memory and affect other pods.
  • Values for limits and requests should be set based on the application’s needs and tuned based on production feedback for the container.
  • If your application needs the Guaranteed Quality of Service class, then set Request == Limit (see the sketch below).
  • Below is a general plot of requests/limits for Memory.
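
As a concrete illustration of the Request == Limit point above, here is a minimal container-spec fragment (reusing the hypothetical app1 container from the earlier Deployment; the values are illustrative) that would land the pod in the Guaranteed QoS class:

containers:
- name: app1
  resources:
    requests:
      memory: "500Mi"
      cpu: "500m"
    limits:
      memory: "500Mi"
      cpu: "500m"

Because every container sets both CPU and memory with requests equal to limits, the kubelet classifies the pod as Guaranteed, which makes it the last candidate for eviction under node pressure.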

Resources for further reading:

  • Request/Limits – Kubernetes docs : here
  • Quality of Services classes for pods in Kubernetes – docs : here
  • Resource types in Kubernetes docs : here

[DDIA Book] : Data Models and Query Languages

[Self Notes and Review]:

This is the second write-up in the series of reading the DDIA book and publishing my notes from it.
The first one can be found here

This particular article is from the second chapter of the book. Again, these are just my self-notes/extracts; treat this more like an overview/summary. The best way is to read the book itself.

This chapter delves into the details of the format in which we write data to databases and the mechanism by which we read it back.


First – a few terminologies:

  • Relational Database – has rows and columns and a schema for all the data
    • Eg: SQL databases such as MySQL or PostgreSQL
  • Non-Relational Database – also known as the Document model, NoSQL etc., targeting the use case where data comes in self-contained documents and relations between one document and another are rare.
    • Eg: MongoDB, where the data is stored as a single entity like a JSON object
  • Graph Database – where all the data is stored as vertices and edges, targeting the use case where anything is potentially related to everything.
    • Eg: Neo4j, Titan, and InfiniteGraph
  • Imperative language – in an imperative language, like most programming languages, you tell the computer what to do and how to do it. For example: get the data and go over the loop twice in a particular order.
  • Declarative language – in a declarative query language like SQL, used for retrieving data from a database, you only tell it what you want – and how to do it is decided by the query optimizer. (A small illustration follows this list.)
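
A tiny illustration of the imperative vs declarative split (a sketch in Python, not from the book verbatim, though the book uses a similar “sharks” example): the loop spells out how to compute the result, while the SQL only states what is wanted and leaves the access path to the query optimizer.

animals = [
    {"name": "Great white", "family": "Sharks"},
    {"name": "Manta ray", "family": "Rays"},
]

# Imperative: we dictate how – iterate, test, accumulate, in this exact order.
sharks = []
for animal in animals:
    if animal["family"] == "Sharks":
        sharks.append(animal)

# Declarative: we state only what we want – the database decides how.
# SELECT * FROM animals WHERE family = 'Sharks';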

Relational Model (RDBMS) vs Document Model

  • SQL is the best-known face of the relational model, and it has lasted for over 30 years.
  • NoSQL is more of an opposite of RDBMS. And sadly the name “NoSQL” doesn’t actually refer to any particular technology; it is more of a blanket term for all non-relational databases.
  • Advantages of Document (NoSQL) over Relational (RDBMS) type databases:
    • Ease of scaling out in NoSQL stores like Mongo, where you can add more shards – while SQL-type (relational) databases are designed more to scale vertically.
    • Ability to store unstructured, semi-structured or structured data in NoSQL – while in an RDBMS you can store only structured data.
    • Ease of updating the schema in NoSQL – in Mongo you can insert docs with a new field and it will work just fine.
    • You can do blue-green deployment in NoSQL by updating one cluster at a time, while in a relational database you may have to take the system down for the schema change.
    • https://www.mongodb.com/nosql-explained/advantages
  • Disadvantages of Document (NoSQL) over Relational (RDBMS) type databases:
    • You cannot directly pick a value from nested JSON in a document DB (you need nested references), while in a relational DB you can pick a specific value by its column and row criteria.
    • The poor support for joins in document databases may or may not be a problem, depending on the application.

[Use-case]: Relational vs Document models for implementing a LinkedIn profile design:

(Figure: the LinkedIn profile represented as a relational schema – source: DDIA)
  • Relational Model
    • In a relational model like SQL, user_id can be used as a unique identifier across multiple tables.
    • regions and industries are common tables which can be shared across different users.
    • IMPORTANT: in the above example, the users table has region_id and industry_id – i.e., it stores an ID and not free text.
      • This helps maintain consistency and avoid ambiguity/duplication. “Greater Boston Area” will have a single ID, and the same will be used for all the profiles that match it.
      • This also helps with updates: you only have to update one place (the regions table) and the change takes effect for all users.
      • The advantage of using an ID is that because it has no meaning to humans, it never needs to change: the ID can remain the same, even if the information it identifies changes. Anything that is meaningful to humans may need to change sometime in the future.
      • “Unfortunately, normalizing this data requires many-to-one relationships (many people live in one particular region, many people work in one particular industry), which don’t fit nicely into the document model. In relational databases, it’s normal to refer to rows in other tables by ID, because joins are easy. In document databases, joins are not needed for one-to-many tree structures, and support for joins is often weak.”
      • A document DB (NoSQL) does not strongly support joins; you have to pull all the data into the application and do the joins in application code, which can sometimes be expensive. (A runnable sketch of this many-to-one pattern follows the document-model example below.)
  • Document model
{
  "user_id":     251,
  "first_name":  "Bill",
  "last_name":   "Gates",
  "summary":     "Co-chair of the Bill & Melinda Gates... Active blogger.",
  "region_id":   "us:91",
  "industry_id": 131,
  "photo_url":   "/p/7/000/253/05b/308dd6e.jpg",
  "positions": [
    {"job_title": "Co-chair", "organization": "Bill & Melinda Gates Foundation"},
    {"job_title": "Co-founder, Chairman", "organization": "Microsoft"}
  ],
  "education": [
    {"school_name": "Harvard University",       "start": 1973, "end": 1975},
    {"school_name": "Lakeside School, Seattle", "start": null, "end": null}
  ],
  "contact_info": {
    "blog":    "https://www.gatesnotes.com/",
    "twitter": "https://twitter.com/BillGates"
  }
}
  • Details on Document model:
    • A self-contained document created in JSON format for the same schema detailed in the above section, stored as a single entity.
    • The lack of an enforced schema in the document model makes it easy to handle data in the application layer.
    • A document DB follows a one-to-many relationship model for a user’s data – all the details of the user are present in the same object, locally, in a tree structure.
    • In a document DB, an object is read completely at once. If each object is very large, this is counterproductive, so it is recommended to keep objects small and to avoid writes that keep growing the same object.
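
To see the many-to-one point in running code, here is a minimal sketch using Python’s built-in sqlite3 module (table and column names are illustrative, modeled on the profile above): the region’s display text lives in exactly one row, profiles reference it by ID, and the join is trivial.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE regions (region_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE users (user_id INTEGER PRIMARY KEY,
                        first_name TEXT,
                        region_id TEXT REFERENCES regions(region_id));
    INSERT INTO regions VALUES ('us:91', 'Greater Seattle Area');
    INSERT INTO users VALUES (251, 'Bill', 'us:91');
""")

# Many-to-one join: renaming the region means updating one row in regions,
# and every user that references it picks up the change.
row = conn.execute("""
    SELECT u.first_name, r.name
    FROM users u JOIN regions r ON u.region_id = r.region_id
    WHERE u.user_id = 251
""").fetchone()
print(row)  # ('Bill', 'Greater Seattle Area')

In a document store with weak join support, the application would instead fetch the user document and resolve region_id itself.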

Query Optimizer:

  • Query Optimizer: when you fire a query which has multiple parts – a where clause, from clause, etc. – the query optimizer decides which part to execute first in the most optimized way. These choices are called “access paths”, and they are decided by the query optimizer. A developer does not have to worry about access paths as they are chosen automatically. When a new index is introduced, the query optimizer makes a decision on whether using it will be helpful, and takes that path automatically.
  • SQL doesn’t guarantee results in any particular order. “The fact that SQL is more limited in functionality gives the database much more room for automatic optimizations.”

Schema Flexibility:

  • In the case of document DBs, although they are called schemaless, that only means there is an implicit schema for the data; it is just not enforced by the DB.
  • It is more like schema-on-read rather than schema-on-write – meaning, when you read data from a document DB, you expect some kind of structure and certain fields to exist on it.
  • When the format of the data changes – for example, a full name has to be split into first name and last name – it is much easier in a document DB, where old documents exist as-is and new data carries the new fields. In a relational database, you would have to perform a migration of the schema for pre-existing data. (A minimal sketch of this follows.)
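
To make the schema-on-read idea concrete, here is a minimal Python sketch (the split-name scenario mirrors the example above; the book shows a similar snippet): new documents carry first_name, while old documents only have the full name, so the application derives the field at read time.

def first_name(user):
    # New-style documents already carry the split field.
    if "first_name" in user:
        return user["first_name"]
    # Old, pre-migration documents only have the full name:
    # derive the field when reading (schema-on-read).
    return user["name"].split(" ")[0]

old_doc = {"user_id": 1, "name": "Bill Gates"}
new_doc = {"user_id": 2, "first_name": "Melinda", "last_name": "French"}
print(first_name(old_doc), first_name(new_doc))  # Bill Melinda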

Graph-like Data Models:

  • Disclaimer : I have just skimmed through this section, as I have not directly worked through on Graph model dbs.
  • When the data has many-to-many relationships, modeling the data in the form of a graph makes more sense.
  • Typical examples of graph modeling use cases:
    • Social media – linking people together
    • Rail networks
    • Web pages linked to each other
  • Structuring the data of a couple in a graph-like model
(Figure source: DDIA book)

Summary:

  • Historically, data started out being represented as one big tree
  • Then engineers found that a lot of data involves many-to-many relationships, which the tree did not represent well. So the Relational Model (SQL) was invented.
  • More recently, developers found that some applications don’t fit well in the relational model either. New nonrelational “NoSQL” datastores have diverged in two main directions:
    • Document databases target use cases where data comes in self-contained documents and relationships between one document and another are rare.
    • Graph databases go in the opposite direction, targeting use cases where anything is potentially related to everything
  • All three models (document, relational, and graph) are widely used today, and each is good in its respective domain.
  • One thing that document and graph databases have in common is that they typically don’t enforce a schema for the data they store, which can make it easier to adapt applications to changing requirements
  • Each data model comes with its own query language or framework. Examples: SQL, MapReduce, MongoDB’s aggregation pipeline, Cypher, SPARQL, and Datalog

Weekly Bullet #40 – Summary for the week

Here are a bunch of Technical / Non-Technical topics that I came across recently and found very resourceful.

Technical :

  • The shortest yet most comprehensive System Design Template for any new service – link here
  • Kafka is one of the most efficiently built transient datastores. This article explains the compute and storage layers of Kafka — link here
  • Consistent Hashing has helped distributed systems with:
    • even shard distribution across nodes in the cluster
    • minimum data movement on adding/removing nodes from the cluster
    • A great explanation of consistent hashing at the link here
  • Picking a database is a long-term commitment. Below is a very high-level guidepost. Please take it with a pinch of salt.
Source: ByteByteGo – Big archive
  • I have been geeking out on rate limiting and how it is implemented in large-scale systems. Below are a few interesting references (and a minimal sketch after this list):
    • Stripe rate limiter: Scaling your API with rate limits — link here
    • AWS: Throttle API requests for better throughput — link here
    • Rate limiters set at Twitter — link here
    • Out of all the available rate-limiting algorithms (Token bucket, Leaking bucket, Fixed window, Sliding window, etc.), Sliding Window is the most comprehensive at handling burst load — Sliding Window explained here
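
Here is a minimal, single-process sketch of the sliding-window-log idea in Python (illustrative only; production limiters such as Stripe’s run the same bookkeeping against a shared store like Redis):

import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.hits = deque()  # timestamps of accepted requests

    def allow(self):
        now = time.monotonic()
        # Evict timestamps that have slid out of the window.
        while self.hits and now - self.hits[0] >= self.window:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(limit=100, window=60)
print(limiter.allow())  # True until 100 requests land within 60s, then False

Because the log keeps one timestamp per accepted request, bursts at window boundaries are handled exactly, at the cost of memory proportional to the limit.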

Non-Technical :

  • Elon Musk’s biography by Walter Isaacson is out this week. Guess the first sentence in the book? — Amazon link here. Also one of the reviews here
    “I re-invented electric cars and am sending people to mars…did you think I was also going to be a chill, normal dude?”
  • The Project Gutenberg Open Audiobook Collection – you can listen to those audiobooks on your Spotify — link here
  • The difference between Measuring and Evaluating — link here
  • Quote from a book:

Did the person take 10 minutes to do their homework? Are they minding the details? If not, don’t encourage more incompetence by rewarding it. Those who are sloppy during the honeymoon (at the beginning) only get worse later

Tools of Titans

Cheers until next time !

Weekly Bullet #39 – Summary for the week

Here are a bunch of Technical / Non-Technical topics that I came across recently and found very resourceful.

Technical :

  • AI-Powered Search and Chat for AWS Docs. The best way of consuming AWS docs. — link here
  • A Framework for Thinking About Systems Change – link here
  • BookWyrm – yet another attempt at building a social network based on books. link here
  • Applying new hardware advancements and benchmarking variants of old databases — link here
    • Cost per gigabyte of RAM is much lower now than it was a decade ago.
    • Alternative approaches considered: log-less databases, single-threaded databases, and transaction-less databases, for certain use cases.
  • I have been re-reading the very famous book “Designing Data-Intensive Applications” by Martin Kleppmann. I am publishing my notes and extracts from the book — link here

Non-Technical :

  • Speed matters: Why working quickly is more important than it seems — link here
  • With “Oppenheimer” being released this past week, did you notice a common theme across Nolan’s movies? — Tweet here
  • How to Do Great Work? – Paul Graham – link here
    • This is “The Best” longform article that I have read in years. Below are a few extracts from the same:
    • “The way to figure out what to work on is by working. If you’re not sure what to work on, guess. But pick something and get going.”
      “Develop a habit of working on your own projects. Don’t let “work” mean something other people tell you to do.”
      “When in doubt, optimize for interestingness. But a field should become increasingly interesting as you learn more about it.”
      “People who do great work are not necessarily happier than everyone else, but they’re happier than they’d be if they didn’t.”

Cheers until next time !

[DDIA Book]: Reliable, Scalable and Maintainable Application

[Self Notes and Review]:

This is a new series of publications where I am publishing my self notes/extracts from reading the very famous book – DDIA (Designing Data-Intensive Applications) by Martin Kleppmann.

This particular article is from the first chapter of the book. Again, these are just my self-notes/extracts; treat this more like an overview/summary. The best way is to read the book itself.

Side note: I am a terribly slow and repetitive reader. The update between chapters might take weeks.

Reliable, Scalable and Maintainable Applications

  • CPU is not a constraint anymore in computing. CPUs these days are inexpensive and more powerful.
  • The general problems these days are the complexity of data, the amount of data, and the rate at which the data changes.
  • Below are the common functionalities of a data intensive application
    • Store data so that they, or another application, can find it again later (databases)
    • Remember the result of an expensive operation, to speed up reads (caches)
    • Allow users to search data by keyword or filter it in various ways (search indexes)
    • Send a message to another process, to be handled asynchronously (stream processing)
    • Periodically crunch a large amount of accumulated data (batch processing)
  • [not imp but note]: “Although a database and a message queue have some superficial similarity—both store data for some time—they have very different access patterns, which means different performance characteristics, and thus very different implementations.”
    • “there are datastores that are also used as message queues (Redis), and there are message queues with database-like durability guarantees (Apache Kafka). The boundaries between the categories are becoming blurred.”

Reliability

  • “The system should continue to work correctly (performing the correct function at the desired level of performance) even in the face of adversity (hardware or software faults, and even human error).”
  • “The things that can go wrong are called faults, and systems that anticipate faults and can cope with them are called fault-tolerant or resilient.”
  • “Note that a fault is not the same as a failure [2]. A fault is usually defined as one component of the system deviating from its spec, whereas a failure is when the system as a whole stops providing the required service to the user”
  • Every software system must be designed to tolerate some kinds of faults rather than prevent every one – but some kinds of failures are better prevented. Example: security-related failures.
  • Hardware Faults
    • hard disk crash, faulty RAM, power grid failure, someone unplugging the wrong network cable
    • “Hard disks are reported as having a mean time to failure (MTTF) of about 10 to 50 years [5, 6]. Thus, on a storage cluster with 10,000 disks, we should expect on average one disk to die per day.” (Sanity check: at an MTTF of ~27 years, 10,000 disks / (27 × 365 days) ≈ 1 failure per day.)
    • For hardware failure, the first solution is to build redundancy in the software to handle the failure of one hardware component – having replicas. Also, make the software tolerant of hardware failures. Example: make the system read-only when more than 2/3 of the nodes are down.
    • “On AWS it is fairly common for virtual machine instances to become unavailable without warning [7], as the platforms are designed to prioritize flexibility and elasticity over single-machine reliability.”
  • Software Faults
    • There can be systemic errors in the system that cause all the nodes of a cluster to go down as a ripple effect. Example: node1 of a DB cluster dies, and all the heavy queries that killed node1 are now shifted to node2. The cluster now has one less node but has to deal with all the load – leading to failure of the other nodes.
    • “The bugs that cause these kinds of software faults often lie dormant for a long time until they are triggered by an unusual set of circumstances.”
    • “There is no quick solution to the problem of systematic faults in software. Lots of small things can help: carefully thinking about assumptions and interactions in the system; thorough testing; process isolation; allowing processes to crash and restart; measuring, monitoring, and analyzing system behavior in production.”
  • Human Errors
    • “Even when they have the best intentions, humans are known to be unreliable”
    • 10-25% of internet outages are due to wrong configuration by humans.
    • Some ways to consider in design
      • “Design systems in a way that minimizes opportunities for error. For example, well-designed abstractions, APIs, and admin interfaces make it easy to do “the right thing” and discourage “the wrong thing.” However, if the interfaces are too restrictive people will work around them, negating their benefit, so this is a tricky balance to get right.”
      • A staging env for people to try, explore and fail safely
      • Testing deeply
      • Make recovery easy – rollback should always be fast
      • “Set up detailed and clear monitoring, such as performance metrics and error rates. In other engineering disciplines this is referred to as telemetry.”

Scalability

  • “As the system grows (in data volume, traffic volume, or complexity), there should be reasonable ways of dealing with that growth.”
  • One common reason for degraded performance of an application is higher load/users than the system was designed for – the application handling more data than it did before.
  • Questions to consider during the design of a scalable application: “If the system grows in a particular way, what are our options for coping with the growth?” and “How can we add computing resources to handle the additional load?”
  • Consider Twitter system design solution for scalability
    • Twitter has two main operations –
      • (1) Post a Tweet – (4.6k requests/sec on average, over 12k requests/sec at peak)
      • (2) Pull the timeline – (300k requests/sec).
    • So, most of the operations are around pulling the timeline, i.e., reading tweets. Twitter’s challenge is not handling the number of people who tweet, but the number of people who read and pull those tweets on their timelines.
    • There are two ways to implement the solution.
      1. every time someone tweets, write it to a DB. When followers pull their timeline, pull those tweets from the DB
      2. every time someone tweets, deliver it to all their followers, more like mail – keep it in each user’s cache. So when the followers pull their timelines, the tweets come from their cache instantly.
    • Option 2 is more effective because the number of people tweeting is small, while the number of people pulling the timeline is large. But this means there is more work at tweet time – a second-order effect.
      • Let’s say I have 10 million followers. So when I tweet, I have to update the caches of 10 million followers, so that when they pull their timelines, the tweets are ready.
      • To avoid that, a hybrid model can be followed. If a user has more than, let’s say, 5 million followers, write their tweets to a common cache instead. So when a user pulls the timeline, use both option 1 and option 2 and merge them based on the people they are following.
  • Average response times – and why you should avoid “average” in general.
    • Averages are the worst: they take into account all the outliers and skew the reported number. An average doesn’t tell you how many users actually experienced the delay.
    • Average here is nothing but the arithmetic mean (given n values, add up all the values, and divide by n).
    • “Usually it is better to use percentiles. If you take your list of response times and sort it from fastest to slowest, then the median is the halfway point: for example, if your median response time is 200 ms, that means half your requests return in less than 200 ms, and half your requests take longer than that.”
      • note: the median is the same as p50 – the 50th percentile
    • “if the 95th percentile response time is 1.5 seconds, that means 95 out of 100 requests take less than 1.5 seconds, and 5 out of 100 requests take 1.5 seconds or more”
    • Also, latency and response time are not the same. Response time is what the client sees (processing time + network time + client render time). Latency is the time the request spends waiting to be served – latent, as in awaiting service.
    • more on percentiles – here (a small sketch follows this list)
  • Two important questions to answer during performance testing
    • If I increase the load without increasing the system resources, how is the performance of the system affected? Is it usable at all?
    • How much of the resources and what all services have to be scaled when the load increases, so that the performance of the application is not degraded?
  • “a system that is designed to handle 100,000 requests per second, each 1 kB in size, looks very different from a system that is designed for 3 requests per minute, each 2 GB in size—even though the two systems have the same data throughput”
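
A small Python sketch of the sort-and-index approach described above (nearest-rank style; real monitoring systems use streaming estimators such as HDR histograms or t-digest instead of sorting every sample):

import math

def percentile(samples, p):
    """Nearest-rank percentile: the value below which ~p% of samples fall."""
    s = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(s)))  # 1-based rank
    return s[rank - 1]

response_ms = [87, 95, 110, 150, 200, 210, 230, 350, 500, 1500]
print(sum(response_ms) / len(response_ms))  # mean = 343.2 ms, skewed by the outlier
print(percentile(response_ms, 50))          # p50 (median) = 200 ms
print(percentile(response_ms, 95))          # p95 = 1500 ms

Note how the mean (343 ms) is higher than what 70% of the requests actually experienced – exactly why percentiles are the better lens.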

Maintainability

  • I thought this section made a lot of obvious commentary. Skimmed and skipped most.
  • “Over time, many different people will work on the system and they should be able to work on it productively”
  • Has three parts to it. Operability, Simplicity, Evolvability
  • Operability
    • Monitoring, tracking, keeping software up to date, SOPs, security updates, following best practices, documentation
  • Simplicity
    • reduce accidental complexity by introducing good abstraction.
  • Evolvability
    • services should be built to be independent
    • follow microservice and 12-Factor app principles to keep them truly independent

See you in the next chapter.

Weekly Bullet #38 – Summary for the week

Here are a bunch of Technical / Non-Technical topics that I came across recently and found very resourceful.

Technical :

  • Don’t turn that Swap off yet. In defense of swap – common misconceptions. – link here
  • Why histograms, and how they are useful. link here
    • If you have used any monitoring or APM tools, you would have come across the histogram form of metric being emitted. This write-up gives a brief on why histograms.
  • S3 isn’t getting cheaper – link here
  • [Podcast – 35mins]: Moving from CEO back to IC – with Mitchell Hashimoto, cofounder of HashiCorp – link here
  • Inside Datadog’s $5M Outage – link here
  • Book recommendation – Being Geek by Michael Lopp. I have been geeking out on Michael quite a bit these days; he has solid books and advice across the internet.

Non-Technical :

  • “A Checklist For First-Time Engineering Managers” – link here
  • Any lofi lovers here? Best lofi with air traffic control radio mix. My new WFH companion for background audio — link here
  • This is such a pleasant read about a WFH day and the pace of it, on a Rands community channel – almost therapeutic – link here
  • An extract from the book:

And like it or not, your boss is judging you by these three criteria:

COMMITMENT
ATTENTION TO DETAIL
IMMEDIATE FOLLOW UP

Mark H. McCormack

Cheers until next time !

Weekly Bullet #37 – Summary for the week

Here are a bunch of Technical / Non-Technical topics that I came across recently and found very resourceful.

Technical :

  • A new book released on the hot topic of eBPF – “Learning eBPF” by Liz Rice. It is a summary form and a quick introduction to eBPF’s capabilities when compared to “BPF Performance Tools” by Brendan Gregg.
  • To understand latency in detail – “Everything You Know About Latency Is Wrong” – link here
  • Why Percentiles Don’t Work the Way You Think? A reason to stop using the average value of metrics, and how percentiles work. Link here
  • Here is a detailed and practical resource on System Design in Software Engineering – link here
  • “Effective and Efficient Observability with OpenTelemetry” – OpenTelemetry is the way to go when it comes to traces/metrics observability in your code base. – link here
  • This is a lovely visualization – “Visualizing Lucene’s segment merges” – link here
  • [Podcast] : What’s new in Go 1.20 – Podcast link , release notes

Non-Technical :

  • 50 Ideas that changed my life – David Perell. Link here. My fav one from the 50 is:

    By reading this, you are choosing not to read something else. Everything we do is like this. Doing one thing requires giving up another. Whenever you explicitly choose to do one thing, you implicitly choose not to do another thing.
  • Mental Liquidity – ability to quickly abandon previous beliefs when the world changes or when you come across new information. link here
  • “Two types of Software Engineers” – One assumes it’s easy because it’s a non-technical problem, the other assumes that’s why it’s hard – link here
  • [Recommended] – “What you give up when moving into engineering management” – Link here
  • Quote from a book:

I write everything down, and since I put my notes where they will pop up again in the right place at the right time, once I have written something down I forget about it. The end result is that when I break from work, I break from work-related stress as well.

What They Don’t Teach You at Harvard Business School

Cheers, until next time!

Weekly Bullet #36 – Summary for the week

Here are a bunch of Technical / Non-Technical topics that I came across recently and found very resourceful.

Technical :

  • [Video-57mins] : What is Continuous Profiling in Performance monitoring and What is Pyroscope – with Ryan Perry – link
  • Go 1.20 is here (link). A thread on all the changes – here
  • “What’s the best lecture series you’ve seen?” – Thread link
  • Some great side project ideas on the thread – here. My fav is PlainTextSports
  • EC2 and cost parameters on AWS – more such single-slide explanations here

Non-Technical :

  • [Video-6mins]: How to double your Brain Power – Tiago Forte, the author of the book – Building a Second Brain – Youtube link
  • “I want to lose every debate” – The mindset to learn here is gold. – Link
  • Wonders of street view – Randomly visit any place from your browser – Link
  • [Podcast]:  Carolyn Coughlin – Becoming a good listener – link
  • How to get new ideas – by Paul Graham – link
  • An extract from a book :

When setting expectations, no matter what has been said or written, if substandard performance is accepted and no one is held accountable—if there are no consequences—that poor performance becomes the new standard. Therefore, leaders must enforce standards.

Extreme Ownership, by Jocko Willink & Leif Babin

Cheers, until next time!

Weekly Bullet #35 – Summary for the week

Here are a bunch of Technical / Non-Technical topics that I came across recently and found very resourceful.

Technical :

  • It is December and Advent of Code is here. What is Advent of Code? – link here. An old podcast on Spotify’s engineering team geeking out every December on AoC – link here
  • [A talk – 31mins] – Concurrency is not Parallelism – “Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.” – YouTube
  • Observability is all about Metrics, Events, Logs and Traces [MELT]. The four core metric types are explained in detail here
  • From P99 Performance conference for 2022 – the power of eBPF for performance insights — link here
  • [Book]: OpenTelemetry is the second most active CNCF project, next to K8s. OpenTelemetry is the next standard for implementing vendor-agnostic observability in any application. Below is a great report on the same.

Non-Technical :

  • The perks of High Documentation, Low Meetings work culture – link here
  • Richard Feynman’s way of taking pressure off yourself and doing something for the fun of it. – link here
  • [Documentary – 40mins] – The speed cubers on netflix – link here
  • Which books have made you a better thinker and problem solver? – some great recommendations here
  • An extract from a book:

How do we tend to view the worst events in history? We tend to assume that the worst that has happened is the worst that can happen, and then prepare for that.
We forget that “the worst” smashed a previous understanding of what was the worst. Therefore, we need to prepare more for the extremes allowable by physics rather than what has happened until now.

The Great Mental Models, Shane Parrish

Cheers, until next time!