11 Sep 2021
|
DDIA
Consistency Guarantees
Most replicated databases provide at least eventual consistency: if you stop writing to the database and wait for some unspecified length of time, then eventually all read requests will return the same value. This is also called convergence, since we expect all replicas to eventually converge to the same value.
This is a weak guarantee, because it says nothing about when the replicas will converge.
Strong consistency doesn’t come for free: systems with stronger guarantees may have worse performance or be less fault-tolerant than systems with weaker guarantees.
Transaction isolation is primarily about avoiding race conditions due to concurrently executing transactions, whereas distributed consistency is mostly about coordinating the state of replicas in the face of delays and faults.
This chapter starts with one of the strongest consistency models in common use, linearizability, then examines the issue of ordering events in a distributed system, particularly around causality and total ordering. Finally, we’ll explore how to atomically commit a distributed transaction.
24 Aug 2021
|
DDIA
Faults and Partial Failures
In distributed systems, there can be partial failure: some parts of the system are broken in some unpredictable way, even though other parts are working fine. This nondeterminism and the possibility of partial failures are what make distributed systems hard to work with.
A supercomputer handles faults by simply stopping the entire cluster workload. A job typically checkpoints its state to durable storage from time to time. After the faulty node is repaired, the job resumes from the last checkpoint.
Cloud computing is different. Many internet-related applications are online services, so it is unrealistic to stop the whole system for repair. Supercomputers are typically built from specialized hardware, where each node is quite reliable and nodes communicate through shared memory and remote direct memory access. In a geographically distributed deployment, communication most likely goes over the internet, which is slow and unreliable compared to local networks. We need to build a reliable system from unreliable components.
It is important to consider a wide range of possible faults and to artificially create such situations in your test environment to see what happens. In distributed systems, suspicion, pessimism, and paranoia pay off.
24 Apr 2021
|
arrays
A typical algorithm using the fast and slow pointer technique is LeetCode 141.
Given head, the head of a linked list, determine if the linked list has a cycle in it.
Define two pointers starting from head with different speeds: the fast pointer moves 2 steps each time while the slow pointer moves 1 step. If there’s no cycle in the linked list, the fast pointer will reach null, and we return. If there is a cycle, the fast and slow pointers will eventually meet.
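The approach above can be sketched in Java; `ListNode` is the usual LeetCode-style definition, written out here so the sketch is self-contained:

```java
// Floyd's cycle detection with fast and slow pointers (a minimal sketch).
class ListNode {
    int val;
    ListNode next;
    ListNode(int val) { this.val = val; }
}

class Solution {
    public boolean hasCycle(ListNode head) {
        ListNode slow = head, fast = head;
        // Fast moves 2 steps per iteration, slow moves 1.
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) return true; // pointers met inside the cycle
        }
        return false; // fast reached null: no cycle
    }
}
```

If there is a cycle, the fast pointer gains one step on the slow pointer each iteration, so the gap between them shrinks to zero and they must meet.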
Why do we set the fast pointer’s speed to 2 steps and the slow pointer’s to 1 step?
06 May 2020
|
flink
What’s Apache Flink
Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.
=> In short, see Flink as a streaming framework.
A Brief Introduction to Streaming
Some vivid paragraphs are referenced from this great article.
Processing data in a streaming fashion is becoming more and more popular, compared to the more “traditional” way of batch-processing big data sets available as a whole. The focus in the industry has shifted: it’s no longer so important how big your data is; it’s much more important how fast you can analyse it and gain insights. That’s why some people now talk about fast data instead of the now old-school big data.
04 Mar 2020
|
java
Since Java 8, CompletableFuture has been available to provide flexible chains of non-blocking operators. CompletableFuture is an implementation of both Future and CompletionStage. Let's see how CompletionStages are chained together to get things done.
Can you tell the print result of the following code sample?
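The original post’s code sample isn’t included in this excerpt; as a stand-in, here is a minimal, hypothetical chain in the same spirit, showing how stages compose:

```java
import java.util.concurrent.CompletableFuture;

public class ChainDemo {
    static String run() {
        return CompletableFuture.supplyAsync(() -> "hello")  // starts on ForkJoinPool.commonPool()
                .thenApply(String::toUpperCase)              // transform the result when it arrives
                .thenCombine(CompletableFuture.completedFuture("!"),
                        (s, suffix) -> s + suffix)           // merge with another stage's result
                .join();                                     // block until the whole chain completes
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints HELLO!
    }
}
```

Each `then*` call registers a callback on the previous stage and returns a new CompletableFuture, so the chain runs without blocking until the final `join()`.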
04 Dec 2019
|
Haskell
Pattern Matching
The order of patterns matters in pattern matching: patterns are tried from top to bottom.
sayMe :: (Integral a) => a -> String
sayMe 1 = "One"
sayMe 2 = "Two"
sayMe 3 = "Three"
sayMe n = "Not between 1 - 3"
*Main> sayMe 3
"Three"
*Main> sayMe 5
"Not between 1 - 3"
If you reverse the order of the patterns, the result will always be “Not between 1 - 3” because the first pattern catches all inputs.
If no pattern matches, the function crashes, so we should always add a catch-all pattern at the end. This looks like Java’s switch-case syntax with a default case; the difference is that Java won’t crash when no case matches.
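For comparison, a rough Java counterpart of sayMe might look like this (a sketch I’m adding here, not from the original post); remove the default case and sayMe(5) simply wouldn’t compile rather than crash at runtime:

```java
public class SayMe {
    // Java analogue of the Haskell sayMe above; default plays the catch-all role.
    static String sayMe(int n) {
        switch (n) {
            case 1: return "One";
            case 2: return "Two";
            case 3: return "Three";
            default: return "Not between 1 - 3";
        }
    }

    public static void main(String[] args) {
        System.out.println(sayMe(3)); // Three
        System.out.println(sayMe(5)); // Not between 1 - 3
    }
}
```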
01 Nov 2019
|
Kafka
I accidentally ran into issues related to Kafka’s partition assignment algorithm when adding new consumers to a consumer group. I checked Kafka’s documentation and am logging my understanding here, hoping to help others understand the algorithm and choose the right one for their application (or write their own).
29 Oct 2019
|
Haskell
Haskell has an interesting static type system: it has type inference and can deduce types on its own. This works not only for simple types (capitalized names like Char), but also for complex types like lists and tuples.
Prelude> :t 'a'
'a' :: Char
Prelude> :t "Hello"
"Hello" :: [Char]
Prelude> :t (1, "abc")
(1, "abc") :: Num a => (a, [Char])