05 Jan 2019
After I updated GitLab to 11.6.1-ee, SSH to the git user failed.
I looked in GitLab and did not find any useful information, so I started checking my sshd service.
Jan 6 06:48:26 UltGeek-GitLabServer sshd[115940]: error: Unsafe AuthorizedKeysCommand "/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell-authorized-keys-check": bad ownership or modes for directory /opt/gitlab/embedded/service/gitlab-shell/bin
Jan 6 06:50:15 UltGeek-GitLabServer sshd[116182]: error: Unsafe AuthorizedKeysCommand "/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell-authorized-keys-check": bad ownership or modes for directory /opt
From the sshd_config man page:
AuthorizedKeysCommand
Specifies a program to be used to look up the user's public keys. The program must be owned by root, not writable by group or others and specified by an absolute path. Arguments to AuthorizedKeysCommand accept the tokens described in the TOKENS section. If no arguments are specified then the username of the target user is used.
The program should produce on standard output zero or more lines of authorized_keys output (see AUTHORIZED_KEYS in sshd(8)). If a key supplied by AuthorizedKeysCommand does not successfully authenticate and authorize the user then public key authentication continues using the usual AuthorizedKeysFile files. By default, no AuthorizedKeysCommand is run.
It turned out the ownership or modes of the directories were wrong after the update. sshd refuses to use the AuthorizedKeysCommand when any directory on its path is group- or world-writable, so I fixed every directory in the chain, including /opt, with chmod 0755 dir.
Finally, it is back to working again. I am not sure if this makes much sense, but it does give better security.
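To see why /opt also had to be fixed: sshd checks every directory from the command all the way up to / and rejects the AuthorizedKeysCommand if any of them is group- or world-writable or not owned by root. Here is a minimal Python sketch of that kind of check, not sshd's actual code, just to illustrate why the whole chain matters:

```python
import os
import stat

def unsafe_dirs(path):
    """Walk from `path` up to '/' and collect directories that look unsafe to sshd:
    not owned by root (uid 0), or writable by group or others."""
    unsafe = []
    current = os.path.abspath(path)
    while True:
        st = os.stat(current)
        if st.st_uid != 0 or st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            unsafe.append((current, oct(stat.S_IMODE(st.st_mode)), st.st_uid))
        if current == "/":
            break
        current = os.path.dirname(current)
    return unsafe

# Check the gitlab-shell directory from the error message above.
for path, mode, uid in unsafe_dirs("/opt/gitlab/embedded/service/gitlab-shell/bin"):
    print(f"{path}: mode {mode}, uid {uid} -- fix with chown root and chmod 0755")
```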
03 Jan 2019
Many connections remain if we get unexpectedly disconnected from the server. They stay inactive but still show up when you type the w command.
How do you drop an SSH connection after being unexpectedly disconnected from the server?
We can use the following command to solve the problem:
ps -ef | grep sshd | grep -v root | grep -v 12345 | grep -v grep | awk '{print "sudo kill -9", $2}' |sh
ps -ef lists all processes. We then filter down to sshd sessions, exclude root-owned processes and your own session, build sudo kill commands with awk, and run them through sh. Just remember to replace 12345 with the PID of your own session. It won't kill any root user session.
If you are on Linux, you can try pkill -o -u YOURUSERNAME sshd to kill the oldest SSH session.
29 Dec 2018
Concept
A basic neural network is a pretty simple way to let a computer learn from data, and it can actually become pretty smart after some time. However, it is hard to make it work well with things like images.
A Convolutional Neural Network makes that easier.
Convolutional Neural Network
The basic concepts are the same, but there are pre-processing steps for the image data.
1. Convolutional Operation
It takes the input image and applies a Feature Detector (filter): each patch of the image is multiplied element-wise with the detector and summed to generate a Feature Map. We create multiple feature maps with different Feature Detectors to obtain the first convolutional layer.
1b. ReLU Layer
Then we apply the Rectifier function (ReLU) to the Feature Map to break linearity, because images are not linear.
Detail: Delving Deep into Rectifiers
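A minimal NumPy sketch of steps 1 and 1b, assuming a small random grayscale image and one 3x3 feature detector (the sizes and filter values are just for illustration):

```python
import numpy as np

def convolve2d(image, detector):
    """Slide the Feature Detector over the image (stride 1, no padding):
    multiply element-wise with each patch and sum to build the Feature Map."""
    ih, iw = image.shape
    kh, kw = detector.shape
    feature_map = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(feature_map.shape[0]):
        for j in range(feature_map.shape[1]):
            feature_map[i, j] = np.sum(image[i:i + kh, j:j + kw] * detector)
    return feature_map

image = np.random.rand(8, 8)                   # toy "image"
detector = np.array([[1, 0, -1],               # a simple vertical-edge-like filter
                     [1, 0, -1],
                     [1, 0, -1]])

feature_map = convolve2d(image, detector)
relu_map = np.maximum(feature_map, 0)          # 1b. ReLU keeps positives, zeroes out negatives
print(feature_map.shape, relu_map.min() >= 0)  # (6, 6) True
```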
2. Max Pooling
We then reduce the size of the Feature Map by doing Max Pooling to generate a smaller Pooled Feature Map. This reduces the number of parameters going into the network and also makes it harder to overfit.
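Continuing the sketch, max pooling with a 2x2 window and stride 2 keeps only the largest value in each window:

```python
import numpy as np

def max_pool(feature_map, size=2, stride=2):
    """Keep only the maximum value in each size x size window."""
    h = (feature_map.shape[0] - size) // stride + 1
    w = (feature_map.shape[1] - size) // stride + 1
    pooled = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            pooled[i, j] = window.max()
    return pooled

pooled = max_pool(np.random.rand(6, 6))
print(pooled.shape)  # (3, 3) -- far fewer values feed the rest of the network
```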
3. Flattening
It basically takes the numbers row by row and puts them into one long column, which can then be fed into a neural network for processing.
4. Full Connection
Unlike a regular neural network, here we have a fully connected network: every node is connected to all the nodes of the previous layer.
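Putting steps 1 through 4 together, here is a minimal Keras sketch of such a network; the layer sizes, the 28x28 input, and the 10-class softmax output are illustrative assumptions, not a prescribed architecture:

```python
# A minimal CNN sketch in Keras: Convolution + ReLU -> Max Pooling -> Flatten -> Full Connection.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),  # steps 1 + 1b
    layers.MaxPooling2D((2, 2)),                                            # step 2
    layers.Flatten(),                                                       # step 3
    layers.Dense(64, activation="relu"),                                    # step 4
    layers.Dense(10, activation="softmax"),                                 # class probabilities
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```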
Softmax & Cross-Entropy
Softmax
The raw outputs do not actually add up to one, so we apply a softmax to turn them into percentages.
\[\begin{align*}
f_{j}(z)=\frac{e^{z_{j}}}{\sum_{k}e^{z_{k}}}
\end{align*}\]
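A small NumPy sketch of the softmax above, with the usual subtract-the-max trick for numerical stability:

```python
import numpy as np

def softmax(z):
    """f_j(z) = exp(z_j) / sum_k exp(z_k), shifted by max(z) for numerical stability."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))        # roughly [0.66 0.24 0.10]
print(softmax(scores).sum())  # 1.0 (up to floating point) -- now the outputs add up to one
```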
Cross-Entropy
Loss function.
\[\begin{align*}
L_{i}=-\log{\frac{e^{f_{yi}}}{\sum_{j}e^{f_{j}}}}
\end{align*}\]
A more general representation, which is easier to calculate:
\[\begin{align*}
H(p,q)=-\sum p(x) \log q(x)
\end{align*}\]
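And a sketch of the cross-entropy loss on top of that softmax; the scores and the chosen true class are arbitrary examples:

```python
import numpy as np

def cross_entropy(scores, true_class):
    """L_i = -log( exp(f_{y_i}) / sum_j exp(f_j) ), i.e. H(p, q) with p a one-hot vector."""
    e = np.exp(scores - np.max(scores))
    probs = e / e.sum()
    return -np.log(probs[true_class])

scores = np.array([2.0, 1.0, 0.1])
print(cross_entropy(scores, true_class=0))  # ~0.42, small: class 0 already has the highest score
print(cross_entropy(scores, true_class=2))  # ~2.32, large: class 2 has the lowest score
```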
24 Dec 2018
Concept
- s - state
- a - action
- R - reward
- \(\gamma\) - Discount
We tell the AI agent only what actions it can take. For example, in a maze, we tell the agent it can move around and we give it a reward (Reinforcement Learning).
How does the agent remember how to get out of the maze? It distinguishes paths by assigning them values, for example by increasing the value of a good path. However, we cannot assign the same value to every action we take, so we introduce the Bellman equation.
Bellman Equation
\(\begin{align*}
V(s)=\underset{\scriptstyle\text{a}}{max}(R(s,a) + \gamma V(s'))
\end{align*}\)
- s = current state
- \(s'\) = next state
- R(s, a) is the reward for taking action a in state s.
We calculate the maximum reward and take the optimal action. \(\gamma\) discounts future values, so the states along a path no longer share the same value, which steers the agent onto the right path.
It basically builds a map.
Additional: Original Paper
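A minimal sketch of the Bellman equation applied to a tiny deterministic "maze": a 1x4 corridor with the exit on the right. The layout, reward, and \(\gamma=0.9\) are made-up illustration values:

```python
# Value iteration on a tiny corridor: states 0..3, exit at state 3, reward +1 for reaching it.
GAMMA = 0.9
N_STATES, EXIT = 4, 3
ACTIONS = (-1, +1)  # move left or move right

def step(s, a):
    """Deterministic transition: move left/right, clamped inside the corridor."""
    return min(max(s + a, 0), N_STATES - 1)

V = [0.0] * N_STATES
for _ in range(50):
    new_V = []
    for s in range(N_STATES):
        if s == EXIT:
            new_V.append(0.0)  # terminal state: nothing left to collect
            continue
        # V(s) = max_a ( R(s, a) + gamma * V(s') )
        new_V.append(max((1.0 if step(s, a) == EXIT else 0.0) + GAMMA * V[step(s, a)]
                         for a in ACTIONS))
    V = new_V

print([round(v, 2) for v in V])  # [0.81, 0.9, 1.0, 0.0] -- the value "map" rises toward the exit
```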
Markov Decision Process
A Markov Decision Process is a mathematical framework for making decisions when outcomes are partly random and partly under our control.
Using the Bellman equation, \(V(s')\) becomes the expected value over all possible next states, because we do not know which state we are going into.
\(\begin{align*}
V(s)=\underset{\scriptstyle\text{a}}{max}(R(s,a) + \gamma \sum_{s'} P(s, a, s')V(s'))
\end{align*}\)
\(P(s, a, s')\) is the probability of ending up in state \(s'\) when taking action \(a\) in state \(s\). It assigns a probability to each possible movement toward the destination.
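A sketch of the stochastic version on the same toy corridor, now assuming each move succeeds with probability 0.8 and otherwise the agent stays put (the slip probability and the expected-reward modeling are illustrative assumptions):

```python
# Stochastic Bellman backup with P(s, a, s') on the same corridor:
# each move succeeds with probability 0.8 and the agent stays put with probability 0.2.
GAMMA = 0.9
N_STATES, EXIT = 4, 3

def transitions(s, a):
    """P(s, a, s'): 80% move as intended, 20% stay in place."""
    intended = min(max(s + a, 0), N_STATES - 1)
    return {intended: 0.8, s: 0.2} if intended != s else {s: 1.0}

def reward(s, a):
    """Expected immediate reward R(s, a): +1 weighted by the chance of landing on the exit."""
    return sum(p for s2, p in transitions(s, a).items() if s2 == EXIT)

V = [0.0] * N_STATES
for _ in range(100):
    V = [0.0 if s == EXIT else
         max(reward(s, a) + GAMMA * sum(p * V[s2] for s2, p in transitions(s, a).items())
             for a in (-1, +1))
         for s in range(N_STATES)]

print([round(v, 2) for v in V])  # roughly [0.75, 0.86, 0.98, 0.0], lower than before: moves can fail
```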
Markov Property
A stochastic process has the Markov property if the conditional probability distribution of future states of the process (conditional on both past and present states) depends only upon the present state, not on the sequence of events that preceded it. A process with this property is called a Markov process. (From wikipedia)
Interesting paper: A survey of applications of Markov decision processes
Living Penalty
With that calculation we get a better result. However, how do we encourage the agent to reach the goal faster?
We can use punishment: each movement carries a small penalty, so the agent tries to finish faster with fewer actions.
Q-Learning Intuition
What is Q
We replace V with Q.
\(\begin{align*}
V(s)&=\underset{\scriptstyle\text{a}}{max}(Q(s,a))\\
Q(s, a)&=R(s,a)+\gamma\sum_{s'} (P(s, a, s')V(s'))\\
Q(s, a)&=R(s,a)+\gamma\sum_{s'} (P(s, a, s')\underset{a'}{max}(Q(s', a')))
\end{align*}\)
V takes the maximum value over actions, while Q keeps a value for every possible action.
In short: \(Q(s, a)=R(s,a)+\gamma \:\underset{a'}{max}(Q(s', a'))\)
Detail: Markov Decision Processes: Concepts and Algorithms
Temporal Difference
\(\begin{align*}
TD(a, s) = R(s,a)+\gamma\:\underset{a'}{max}(Q(s', a'))-Q(s,a)
\end{align*}\)
It is a difference in time: \(Q(s, a)\) is the estimate we had before taking the action, and \(R(s,a)+\gamma\:\underset{a'}{max}(Q(s', a'))\) is the new estimate afterwards.
We want to update the value by only a small amount each time:
\(\begin{align*}
TD(a, s) = R(s,a)+\gamma\:\underset{a'}{max}(Q(s', a'))-Q_{t-1}(s,a)\\
Q_{t}(s,a)=Q_{t-1}(s,a)+\alpha TD_{t}(a,s)
\end{align*}\)
\(\alpha\) is the learning rate.
We want to get TD as close to 0 as possible, which means we have converged to the optimal action values.
\(\begin{align*}
Q_{t}(s,a)=Q_{t-1}(s,a)+\alpha (R(s,a)+\gamma\:\underset{a'}{max}(Q(s', a'))-Q_{t-1}(s,a))
\end{align*}\)
Additional Reading: Learning to predict by the methods of temporal differences
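A minimal tabular Q-learning sketch on the same toy corridor, applying exactly the update \(Q_{t}(s,a)=Q_{t-1}(s,a)+\alpha TD_{t}(a,s)\); the corridor, \(\alpha\), \(\gamma\), and \(\epsilon\) are illustrative choices:

```python
import random

# Tabular Q-learning on the 4-state corridor, learning Q(s, a) from experience.
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1
N_STATES, EXIT = 4, 3
ACTIONS = (-1, +1)

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Move, clamped inside the corridor; reward +1 for reaching the exit."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == EXIT else 0.0)

for episode in range(500):
    s = 0
    while s != EXIT:
        # epsilon-greedy behaviour: mostly exploit the best known action, sometimes explore
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        best_next = 0.0 if s2 == EXIT else max(Q[(s2, act)] for act in ACTIONS)
        td = r + GAMMA * best_next - Q[(s, a)]   # temporal difference
        Q[(s, a)] += ALPHA * td                  # Q_t(s,a) = Q_{t-1}(s,a) + alpha * TD
        s = s2

print({k: round(v, 2) for k, v in Q.items()})    # the "move right" actions end up most valuable
```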
Deep Q-Learning Intuition
Simple Q-learning is no longer suited to complicated environments. We can use deep learning to estimate Q instead, so we are not limited to one simple rule and formula.
The neural network will predict the Q value.
How do we adapt TD to the predicted Q values? We calculate the error/loss and use backpropagation to update the weights.
Resource: Simple Reinforcement Learning with Tensorflow Part 4: Deep Q-Networks and Beyond
Experience Replay
In some environments there are many different conditions the AI needs to learn from. Take a self-driving car learning how to drive: each movement has to go through the network, but it cannot learn well from one piece of experience at a time. Therefore, it remembers sequences of experiences in a memory and, once it has enough information, learns from batches sampled from that memory. This enhances the learning process and reduces the amount of practice needed.
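A minimal sketch of such a replay memory, assuming the usual (state, action, reward, next_state, done) tuples; the capacity and batch size are arbitrary:

```python
import random
from collections import deque

class ReplayMemory:
    """Remember past experiences and hand back random batches to learn from."""

    def __init__(self, capacity=10000):
        self.memory = deque(maxlen=capacity)  # the oldest experiences fall out automatically

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        return random.sample(self.memory, batch_size)

    def __len__(self):
        return len(self.memory)

# Store every transition; once there is enough information, train on random batches.
memory = ReplayMemory()
for t in range(1000):
    memory.push(state=t, action=0, reward=0.0, next_state=t + 1, done=False)  # dummy data
if len(memory) >= 32:
    batch = memory.sample(32)
    print(len(batch))  # 32 transitions drawn from different points in time
```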
Action Selection Policies
When the network outputs a score for each action, how does the machine choose which action to take? There are many action selection policies; one example is “softmax”, which decides which action to take.
In exploration, each action is taken with a probability, so the selection has some randomness rather than always picking the best action.
Resource: Adaptive ε-greedy Exploration in Reinforcement Learning Based on Value Differences
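A sketch of two common action selection policies applied to the network's output values, softmax sampling and ε-greedy; the Q-values and ε here are made up:

```python
import numpy as np

def softmax_action(q_values, temperature=1.0):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    prefs = np.exp((q_values - np.max(q_values)) / temperature)
    probs = prefs / prefs.sum()
    return np.random.choice(len(q_values), p=probs), probs

def epsilon_greedy_action(q_values, epsilon=0.1):
    """With probability epsilon pick a random action (explore), otherwise the best one (exploit)."""
    if np.random.rand() < epsilon:
        return np.random.randint(len(q_values))
    return int(np.argmax(q_values))

q_values = np.array([1.0, 2.0, 0.5])
action, probs = softmax_action(q_values)
print(action, probs.round(2))           # better actions are picked more often, but not always
print(epsilon_greedy_action(q_values))  # usually 1, occasionally a random exploratory action
```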
18 Dec 2018
BitTorrent is currently a pretty popular way to download files; besides this, there is PT as well.
The advantages of using BitTorrent:
- It uses peer-to-peer networking to serve files.
- The torrent contains file information (piece hashes) that is used to check whether the downloaded data is corrupted (see the sketch below).
- It encourages people to upload, so others can download faster by sharing the portions that other clients do not have yet.
- The protocol is open and many clients are open source.
There is a tracker server that keeps track of all the peers downloading the file, so peers can find each other and make the download faster by working together.
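To make the corruption check concrete: the .torrent metadata stores a SHA-1 hash for every fixed-size piece, and the client compares each downloaded piece against its hash. A rough sketch of that idea (the piece size and data here are stand-ins, not a real torrent parser):

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # torrents split content into fixed-size pieces

def piece_hashes(data, piece_length=PIECE_LENGTH):
    """SHA-1 hash of every piece, like the ones stored in the .torrent metadata."""
    return [hashlib.sha1(data[i:i + piece_length]).digest()
            for i in range(0, len(data), piece_length)]

def verify_piece(piece_data, expected_hash):
    """A client keeps a downloaded piece only if its hash matches the torrent's hash."""
    return hashlib.sha1(piece_data).digest() == expected_hash

original = b"some file content" * 100000
hashes = piece_hashes(original)

good_piece = original[:PIECE_LENGTH]
bad_piece = b"x" + good_piece[1:]           # simulate a corrupted download
print(verify_piece(good_piece, hashes[0]))  # True
print(verify_piece(bad_piece, hashes[0]))   # False -- the client would re-download this piece
```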