Meijia Lyu's Blog Always keep mind sharp

Elastic Search Basics

These are the key notes I took while exploring Elastic Search.

Basics of Haskell Learning I

『DDIA』CH11 - Streaming Processing

I gave a talk combining this chapter with the first two chapters of the book. I uploaded the presentation PDF, with the detailed code from eBay's projects removed.

Presentation Document

How to leverage CompletableFuture to reach non-blocking service dependent system?

A typical risk evaluation service depends on several remote services to enrich user or transaction data, so that it gets a full picture of the data needed for risk evaluation. Here’s a scenario showing how our service works:

When an evaluation request comes:

  1. purchase service is called to get a picture of user purchase; – A
  2. instrument service is called to get full data of user payment instrument; – B
  3. blacklist service is called to check if the payment instrument is blacklisted; – C
  4. blacklist service is called to check if the user purchase shipping address phone is blacklisted; – D

In this case, we get a DAG of execution: request -> A -> B -> C -> D
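The chain above can be sketched with `CompletableFuture.thenCompose`, which starts each call only when the previous one completes, without parking a thread in between. This is a minimal sketch, not the production code: the four service methods, their string results, and the class name `RiskEvaluation` are hypothetical stand-ins for the real remote clients.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-ins for the remote services; names and payloads are assumptions.
class RiskEvaluation {
    static CompletableFuture<String> purchaseService(String requestId) {       // A
        return CompletableFuture.supplyAsync(() -> "purchase(" + requestId + ")");
    }

    static CompletableFuture<String> instrumentService(String purchase) {      // B
        return CompletableFuture.supplyAsync(() -> "instrument(" + purchase + ")");
    }

    static CompletableFuture<String> instrumentBlacklist(String instrument) {  // C
        return CompletableFuture.supplyAsync(() -> "instrumentCheck(" + instrument + ")");
    }

    static CompletableFuture<String> phoneBlacklist(String soFar) {            // D
        return CompletableFuture.supplyAsync(() -> "phoneCheck(" + soFar + ")");
    }

    // request -> A -> B -> C -> D: each stage is scheduled when the previous
    // future completes, so the calling thread never waits between stages.
    static CompletableFuture<String> evaluate(String requestId) {
        return purchaseService(requestId)
                .thenCompose(RiskEvaluation::instrumentService)
                .thenCompose(RiskEvaluation::instrumentBlacklist)
                .thenCompose(RiskEvaluation::phoneBlacklist);
    }

    public static void main(String[] args) {
        // join() blocks only here, at the edge of the demo.
        System.out.println(evaluate("req-1").join());
    }
}
```

In a real service the final future would normally be handed back to the framework instead of being joined, so that no request thread blocks at all.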

Flink Custom Stream Join

In these notes, I’ll introduce several Window experiments in Flink, then show my POC sample code for joining two streams, one with dense input and one with sparse input. The following experiments are based on the SocketWordCount job, with text as the input stream.

『DDIA』- CH07: Transactions Reading Notes

Transactions were created to simplify the programming model for applications accessing a database.

ACID: Atomicity, Consistency, Isolation, Durability.

BASE: Basically Available, Soft state, Eventual consistency.

Atomicity

In general: something that cannot be broken down into smaller parts. In multi-threaded programming: other threads can only see the system in the state it was in before the operation or after the operation, not something in between.
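The multi-threaded sense of the word can be illustrated with `AtomicInteger`. This is only a small sketch: the class and method names are made up for the example, and it shows atomic operations in the threading sense, not database transactions.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    // Starts `threads` threads that each increment a shared counter `perThread` times.
    static int countWithAtomic(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    // Atomic read-modify-write: no thread can observe a half-done increment.
                    counter.incrementAndGet();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(4, 10_000)); // prints 40000
    }
}
```

With a plain `int` and `counter++`, the read, add, and write steps could interleave between threads and some increments could be lost.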

『DDIA』- CH06: Partition Reading Notes

Partitioning and Replication Combination

Partitioning of Key-Value Data

  • Skewed: some partitions hold more data or receive more queries than others

1. Partitioning by Key Range

  • Hot spots, e.g. when the key is a timestamp

2. Partitioning by Hash of Key

  • The hash function need not be cryptographically strong, but it must produce the same hash value for the same key in different processes. [REF reading: Java’s hashCode is not safe for distributed systems](http://martin.kleppmann.com/2012/06/18/java-hashcode-unsafe-for-distributed-systems.html)
  • Consistent hashing
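A minimal sketch of partitioning by hash of key, assuming a fixed number of partitions and a hash computed only from the key's bytes, so every process gets the same answer. FNV-1a is my choice for the illustration (simple and not cryptographically strong); the class and method names are hypothetical, and this is plain modulo assignment, not consistent hashing.

```java
import java.nio.charset.StandardCharsets;

class HashPartitioner {
    // 32-bit FNV-1a: depends only on the input bytes, never on object identity
    // or JVM state, so it is stable across processes and machines.
    static int fnv1a(byte[] data) {
        int hash = 0x811C9DC5;        // FNV offset basis
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 16777619;         // FNV prime
        }
        return hash;
    }

    // Maps a key to a partition index in [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        int hash = fnv1a(key.getBytes(StandardCharsets.UTF_8));
        // floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        for (String key : new String[] {"user-1", "user-2", "2024-01-01T00:00:00"}) {
            System.out.println(key + " -> partition " + partitionFor(key, 8));
        }
    }
}
```

The trade-off is that keys which were adjacent in sort order are scattered across partitions, so range queries become less efficient.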

『DDIA』- CH05: Replication Reading Notes

This is the beginning of Part 2, which starts to talk about distributed data storage.

Why do we need a distributed database? Scalability, fault tolerance / high availability, latency.

Scaling to higher load

  • Shared-memory architecture: all components can be treated as a single machine (vertical scaling / scaling up)
    • Problem: cost grows faster than linearly
  • Shared-disk architecture: independent CPUs and RAM but stores data on an array of disks connected via a fast network
    • Problem: contention and overhead of locking limit the scalability

Contrast:

  • Shared-nothing architecture: independent nodes coordinate at the software level using a conventional network (horizontal scaling / scaling out)
    • requires the most caution from the application developers

『DDIA』- CH04 Reading Notes

Encoding and Evolution

  • encoding: in-memory objects -> sequence of bytes
  • decoding: the reverse, sequence of bytes -> in-memory objects
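As a small illustration of the two directions, here is a hand-rolled round trip using `DataOutputStream` and `DataInputStream`. The `User` class and its fields are hypothetical, and the byte layout is just what these stream classes produce, not a format taken from the book.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class EncodingDemo {
    // Hypothetical in-memory object used only for this example.
    static final class User {
        final String name;
        final int age;

        User(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Encoding: in-memory object -> sequence of bytes.
    static byte[] encode(User user) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(user.name); // 2-byte length prefix + modified UTF-8 bytes
            out.writeInt(user.age);  // 4 bytes, big-endian
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decoding: sequence of bytes -> in-memory object, reading fields in the same order.
    static User decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            String name = in.readUTF();
            int age = in.readInt();
            return new User(name, age);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] encoded = encode(new User("Ada", 36));
        User decoded = decode(encoded);
        System.out.println(encoded.length + " bytes -> " + decoded.name + ", " + decoded.age);
    }
}
```

The field order is implicit in the code on both sides, so adding or reordering a field breaks old readers. That is the evolution problem this chapter is about.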

Language-specific formats such as Java’s Serializable have many problems when used for encoding and decoding:

  • Not compatible with other languages
  • Decoding can instantiate arbitrary classes, which causes security problems
  • Versioning data is inconvenient
  • Efficiency is often poor

Design Patterns at Work (Draft)

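I have recently been reviewing my project's code with Sonar. After fixing some of the more obvious Vulnerabilities, I found that many subclasses in the project make Sonar report a large number of Duplicate code smells, because each of them uses @Inject for some components and repeats the same initialization and variable registration. These possibly unnecessary warnings could be bypassed by configuring Sonar rules, but I decided to first see whether there is a more elegant solution.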

Here is a description of the current code structure: