CGABC

When SLAM meets XR and Robotics

2022-12-23T16:00:00.000Z

When SLAM meets XR

When SLAM meets Robotics

GPG Overview

2022-12-11T16:00:00.000Z

Overview

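Algorithms currently supported by PGP

Asymmetric algorithms: RSA, ELG, DSA, ECDH, ECDSA, EDDSA
Symmetric algorithms: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH, CAMELLIA128, CAMELLIA192, CAMELLIA256
Hash algorithms: SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224
Compression algorithms: Uncompressed, ZIP, ZLIB, BZIP2

Unless practical quantum computers arrive, 2048-bit RSA is, for now, considered infeasible to break.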

GPG

Generate a Key

gpg --gen-key

# or

gpg --full-generate-key

output

gpg (GnuPG) 2.2.4; Copyright (C) 2017 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Please select what kind of key you want:
   (1) RSA and RSA (default)
   (2) DSA and Elgamal
   (3) DSA (sign only)
   (4) RSA (sign only)
Your selection? 
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (3072) 
Requested keysize is 3072 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 
Key does not expire at all
Is this correct? (y/N) y

GnuPG needs to construct a user ID to identify your key.

Real name: Gavin Gao
Email address: cggos@outlook.com
Comment: 
You selected this USER-ID:
    "Gavin Gao "

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit? o
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.
gpg: key 94FB606ACFB828F6 marked as ultimately trusted
gpg: directory '/home/cg/.gnupg/openpgp-revocs.d' created
gpg: revocation certificate stored as '/home/cg/.gnupg/openpgp-revocs.d/D06142ABCC08402AFCDB2FAF94FB606ACFB828F6.rev'
public and secret key created and signed.

pub   rsa3072 2022-04-28 [SC]
      D06142ABCC08402AFCDB2FAF94FB606ACFB828F6
uid                      Gavin Gao 
sub   rsa3072 2022-04-28 [E]

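Here the Key ID is

`94FB606ACFB828F6`

and the passphrase of the private key is

`xxxx gpg`

Generate Subkeys

Use subkeys for everyday work; the primary key should not be used for anything except issuing new subkeys.

It is advisable to generate a separate subkey for each environment and each purpose so that they do not interfere with one another.

`gpg --edit-key cggos@outlook.com`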

output

gpg (GnuPG) 2.2.4; Copyright (C) 2017 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Secret key is available.

sec  rsa3072/94FB606ACFB828F6
     created: 2022-04-28  expires: never       usage: SC  
     trust: ultimate      validity: ultimate
ssb  rsa3072/BB0088AB554CF92D
     created: 2022-04-28  expires: never       usage: E   
[ultimate] (1). Gavin Gao 

gpg> addkey 
Please select what kind of key you want:
   (3) DSA (sign only)
   (4) RSA (sign only)
   (5) Elgamal (encrypt only)
   (6) RSA (encrypt only)
Your selection? 4
RSA keys may be between 1024 and 4096 bits long.
What keysize do you want? (3072) 
Requested keysize is 3072 bits
Please specify how long the key should be valid.
         0 = key does not expire
      <n>  = key expires in n days
      <n>w = key expires in n weeks
      <n>m = key expires in n months
      <n>y = key expires in n years
Key is valid for? (0) 
Key does not expire at all
Is this correct? (y/N) y
Really create? (y/N) y
We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

sec  rsa3072/94FB606ACFB828F6
     created: 2022-04-28  expires: never       usage: SC  
     trust: ultimate      validity: ultimate
ssb  rsa3072/BB0088AB554CF92D
     created: 2022-04-28  expires: never       usage: E   
ssb  rsa3072/3384DE02354CC62E
     created: 2022-04-28  expires: never       usage: S   
[ultimate] (1). Gavin Gao 

gpg> save

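Revocation Certificate

Generate a revocation certificate so that, if the key is ever invalidated, you can ask the public keyservers to revoke your public key.

`gpg --gen-revoke 94FB606ACFB828F6`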

output

sec  rsa3072/94FB606ACFB828F6 2022-04-28 Gavin Gao 

Create a revocation certificate for this key? (y/N) y
Please select the reason for the revocation:
  0 = No reason specified
  1 = Key has been compromised
  2 = Key is superseded
  3 = Key is no longer used
  Q = Cancel
(Probably you want to select 1 here)
Your decision? 
Enter an optional description; end it with an empty line:
> 
Reason for revocation: Key has been compromised
(No description given)
Is this okay? (y/N) y
ASCII armored output forced.
-----BEGIN PGP PUBLIC KEY BLOCK-----
Comment: This is a revocation certificate

iQG2BCABCgAgFiEE0GFCq8wIQCr82y+vlPtgas+4KPYFAmJqewwCHQIACgkQlPtg
as+4KPYkDgv9FsNmqApXiu/p9y/M9pLQHgnQwmRbmjCEhZ+qpfQX2s5+zhpWQOtj
WqtL9EwM12Wld/aLsFnsjd1cU8hE5FpaPIt3slSNPIjqF5n1JKZqHkK850XKj2Z7
MMBsA6t3P7p7VSpP2oS0/d3q5bio9z37HePlp67gGYMHINsiVfHlrNBCNhhQ8K1q
yZmAORtbr0X2WS+ljG8aqqFg5dlG7WAhK/MugbKIFzdkc5Xugu5oQUgq3uogXGZg
o30/GS4a5KyTNSSWbO1vMA/tfYlhDsN+ywqqrStCGWCjO/JJk9Am6eG16zf9wyCW
PtXjc67X5WiOR+t2SWkbVcGZFJMlAdQwRF/D64qskGvap96qj8+I3U9hNaqG8W5A
5dO+vPEocDhPs0AqGzytFVmG88EyOIenhvVc8xtO9JrkFFUN0XBdsQoA162RH2tv
XH25wy3ZSzkdCXKYlQHFM7SIg6Lhfxl/j3ucueWlVciECEKnqixTw6Uq/Px/T+8h
Fv1vvK63BrJy
=BuV4
-----END PGP PUBLIC KEY BLOCK-----
Revocation certificate created.

Please move it to a medium which you can hide away; if Mallory gets
access to this certificate he can use it to make your key unusable.
It is smart to print this certificate and store it away, just in case
your media become unreadable.  But have some caution:  The print system of
your machine might store the data and make it available to others!

List Local Keys

gpg --list-keys
# or
gpg --list-secret-keys

output

/home/cg/.gnupg/pubring.kbx
---------------------------
pub   rsa3072 2022-04-28 [SC]
      D06142ABCC08402AFCDB2FAF94FB606ACFB828F6
uid           [ultimate] Gavin Gao 
sub   rsa3072 2022-04-28 [E]


gpg --list-secret-keys --fingerprint --keyid-format long

gpg --fingerprint -K --keyid-format long

/home/cg/.gnupg/pubring.kbx
---------------------------
sec   rsa3072/94FB606ACFB828F6 2022-04-28 [SC]
      Key fingerprint = D061 42AB CC08 402A FCDB  2FAF 94FB 606A CFB8 28F6
uid                 [ultimate] Gavin Gao 
ssb   rsa3072/BB0088AB554CF92D 2022-04-28 [E]
ssb   rsa3072/3384DE02354CC62E 2022-04-28 [S]
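The listings above show both a 40-hex-digit fingerprint and a 16-hex-digit key ID for the same key. For version 4 OpenPGP keys, the long key ID is the low 64 bits of the fingerprint, so one can be derived from the other. The following is a minimal sketch of that relationship; the helper names `long_key_id` and `short_key_id` are made up for illustration and are not part of GnuPG.

```python
def long_key_id(fingerprint: str) -> str:
    """Return the 64-bit (long) key ID of a v4 OpenPGP fingerprint.

    Spaces, as printed by `gpg --fingerprint`, are ignored.
    """
    hex_digits = fingerprint.replace(" ", "").upper()
    if len(hex_digits) != 40:
        raise ValueError("expected a 40-hex-digit v4 fingerprint")
    int(hex_digits, 16)  # raises ValueError if a non-hex character is present
    return hex_digits[-16:]


def short_key_id(fingerprint: str) -> str:
    """Return the 32-bit (short) key ID: the last 8 hex digits."""
    return long_key_id(fingerprint)[-8:]


if __name__ == "__main__":
    primary = "D061 42AB CC08 402A FCDB  2FAF 94FB 606A CFB8 28F6"
    print(long_key_id(primary))   # 94FB606ACFB828F6
    print(short_key_id(primary))  # CFB828F6
```

Short key IDs are easy to collide on purpose, which is one reason to prefer the long form or the full fingerprint when identifying a key.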

Export Keys

Public Key

gpg -ao public-key.txt --export [user ID]

gpg --armor --output public-key.txt --export 94FB606ACFB828F6

Private Key

gpg --armor --output private-key.txt --export-secret-keys 94FB606ACFB828F6

# keybase
gpg --export-secret-keys -a 94FB606ACFB828F6

# Note the trailing "!": without it, all subkeys are exported as well
gpg -ao secret-key.txt  --export-secret-key     94FB606ACFB828F6! # export the primary secret key; consider replacing secret-key.txt with the path of the backup file on your encrypted device so that it goes straight onto the device
gpg -ao subkey-s.txt    --export-secret-subkeys 3384DE02354CC62E!   # export the signing secret subkey, marked [S]
gpg -ao subkey-e.txt    --export-secret-subkeys BB0088AB554CF92D!     # export the encryption secret subkey, marked [E]; replace the ID with your own subkey ID

# Do not forget to back up the revocation certificate you just generated as well

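Delete Local Keys

gpg --delete-secret-keys [user ID]  # delete the secret key; the user ID can also be replaced with a subkey ID or the primary key ID

gpg --delete-keys [user ID]      # delete the public key
gpg --delete-key [user ID]

# To delete everything, it is simpler to remove the directory $HOME/.gnupg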

gpg (GnuPG) 2.2.4; Copyright (C) 2017 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

sec  rsa3072/94FB606ACFB828F6 2022-04-28 Gavin Gao 

Delete this key from the keyring? (y/N) y
This is a secret key! - really delete? (y/N) y

pub  rsa3072/94FB606ACFB828F6 2022-04-28 Gavin Gao 

Delete this key from the keyring? (y/N) y

Public Keyserver

keyserver

Default configuration

# ~/.gnupg/gpg.conf

keyserver hkps://keys.openpgp.org

keyid-format 0xlong
with-fingerprint

Upload

`gpg --keyserver hkps://keys.openpgp.org --send-keys 94FB606ACFB828F6`

Verify the email address

Search in a browser

Search

gpg --keyserver hkps://keys.openpgp.org --search-keys 94FB606ACFB828F6

# or

gpg --keyserver hkps://keys.openpgp.org --search-keys cggos@outlook.com

output

gpg: data source: https://keys.openpgp.org:443
(1)	Gavin Gao <cggos@outlook.com>
  3072 bit RSA key 94FB606ACFB828F6, created: 2022-04-28
Keys 1-1 of 1 for "94FB606ACFB828F6".  Enter number(s), N)ext, or Q)uit > N

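Public Key Fingerprint

Traditional keyservers have no verification mechanism, so anyone can upload a public key in your name, and the reliability of a key found on a server cannot be guaranteed. A common remedy is to publish the fingerprint of your public key on a website so that others can check whether the key they downloaded is genuine. The --fingerprint option prints the fingerprint.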

gpg --fingerprint 94FB606ACFB828F6

gpg --fingerprint -K --keyid-format long

output

pub   rsa3072 2022-04-28 [SC]
      D061 42AB CC08 402A FCDB  2FAF 94FB 606A CFB8 28F6
uid           [ultimate] Gavin Gao 
sub   rsa3072 2022-04-28 [E]

Import Keys

Import from a file

gpg --import [key file] # the subkey file backed up earlier, or someone else's public key

gpg --import subkey-s.txt

gpg: key 94FB606ACFB828F6: "Gavin Gao " not changed
gpg: To migrate 'secring.gpg', with each smartcard, run: gpg --card-status
gpg: key 94FB606ACFB828F6: secret key imported
gpg: Total number processed: 1
gpg:              unchanged: 1
gpg:       secret keys read: 1
gpg:   secret keys imported: 1

Fetch a public key from a keyserver:

`gpg --keyserver keys.openpgp.org --recv-keys 94FB606ACFB828F6`

output

gpg: key 94FB606ACFB828F6: public key "Gavin Gao " imported
gpg: Total number processed: 1
gpg:               imported: 1

Keybase

`keybase pgp select`

You are selecting a PGP key from your local GnuPG keychain, and
will publish a statement signed with this key to make it part of
your Keybase.io identity.

Note that GnuPG will prompt you to perform this signature.

You can also import the secret key to *local*, *encrypted* Keybase
keyring, enabling decryption and signing with the Keybase client.
To do that, use "--import" flag.

Learn more: keybase pgp help select

#    Algo    Key Id             Created   UserId
=    ====    ======             =======   ======
1    3072R   94FB606ACFB828F6             Gavin Gao 
Choose a key: 1
▶ INFO Generated new PGP key:
▶ INFO   user: Gavin Gao 
▶ INFO   3072-bit RSA key, ID 94FB606ACFB828F6, created 2022-04-28

Applications

File Verification

Sign

`gpg --detach-sign demo.txt`

Verify

`gpg --verify demo.txt.sig demo.txt`

gpg: Signature made Thu 28 Apr 2022 08:14:27 PM CST
gpg:                using RSA key C8BA9D0647339A178B7545F03384DE02354CC62E
gpg: Good signature from "Gavin Gao " [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: D061 42AB CC08 402A FCDB  2FAF 94FB 606A CFB8 28F6
     Subkey fingerprint: C8BA 9D06 4733 9A17 8B75  45F0 3384 DE02 354C C62E

Trust

gpg --edit-key cggos@outlook.com

gpg (GnuPG) 2.2.4; Copyright (C) 2017 Free Software Foundation, Inc.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

pub  rsa3072/94FB606ACFB828F6
     created: 2022-04-28  expires: never       usage: SC  
     trust: unknown       validity: unknown
sub  rsa3072/3384DE02354CC62E
     created: 2022-04-28  expires: never       usage: S   
sub  rsa3072/BB0088AB554CF92D
     created: 2022-04-28  expires: never       usage: E   
[ unknown] (1). Gavin Gao 

gpg> trust 
pub  rsa3072/94FB606ACFB828F6
     created: 2022-04-28  expires: never       usage: SC  
     trust: unknown       validity: unknown
sub  rsa3072/3384DE02354CC62E
     created: 2022-04-28  expires: never       usage: S   
sub  rsa3072/BB0088AB554CF92D
     created: 2022-04-28  expires: never       usage: E   
[ unknown] (1). Gavin Gao 

Please decide how far you trust this user to correctly verify other users' keys
(by looking at passports, checking fingerprints from different sources, etc.)

  1 = I don't know or won't say
  2 = I do NOT trust
  3 = I trust marginally
  4 = I trust fully
  5 = I trust ultimately
  m = back to the main menu

Your decision? 5
Do you really want to set this key to ultimate trust? (y/N) y

pub  rsa3072/94FB606ACFB828F6
     created: 2022-04-28  expires: never       usage: SC  
     trust: ultimate      validity: unknown
sub  rsa3072/3384DE02354CC62E
     created: 2022-04-28  expires: never       usage: S   
sub  rsa3072/BB0088AB554CF92D
     created: 2022-04-28  expires: never       usage: E   
[ unknown] (1). Gavin Gao 
Please note that the shown key validity is not necessarily correct
unless you restart the program.

gpg> 
gpg: signal Interrupt caught ... exiting

git

https://docs.github.com/cn/authentication/managing-commit-signature-verification

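It can be used to ~~show off in a blog profile as a badge of identity~~ give people a secure way to contact you.

Replacing the OpenSSH agent

Once you have your own PGP key, you can use gpg-agent in place of the OpenSSH agent for SSH operations. The switch does not make SSH any more secure, though; it is mostly for the joy of tinkering.

If a benefit has to be named, it is probably that a YubiKey (a hardware cryptographic smart card) becomes more convenient to use for SSH.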

apt-get

apt-get update or aptitude update reports the following error:

The following signatures couldn't be verified because the public key is not available: : NO_PUBKEY B5B7720097BB3B58

Solution:

# Fetch the missing public key B5B7720097BB3B58 from any keyserver.
gpg --keyserver subkeys.pgp.net --recv-keys B5B7720097BB3B58

# Import the public key B5B7720097BB3B58 into apt.
gpg -a --export B5B7720097BB3B58 | sudo apt-key add -

curl -s https://raw.githubusercontent.com/ros/rosdistro/master/ros.asc | sudo apt-key add -

Common Mobile Robot Chassis and Their Kinematics

2022-11-21T16:00:00.000Z

Overview ^[1]

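Mobile robots usually rely on motor-driven wheels to move. Different chassis structures lead to entirely different kinematics. Depending on the wheel type, common chassis structures include the differential-drive model, the skid-steer model, the Ackermann model, and the omnidirectional-wheel model.

In the figure below: (a) two-wheel differential-drive robot, (b) Ackermann robot, (c) four-wheel-drive robot, (d) dual-track robot, (e) Mecanum-wheel omnidirectional robot, (f) omni-wheel omnidirectional robot, (g) four-wheel-drive four-wheel-steering robot ^[2]

Kinematic analysis in ROS is divided into forward kinematics and inverse kinematics.

Forward kinematic model: computes the chassis velocity from the speed of each wheel, as measured by the motor encoders, which enables dead reckoning.
Inverse kinematic model: converts the chassis velocity command /cmd_vel into the target speed of each wheel.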

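Two-Wheel Differential-Drive Model (Differential Drive robot)

The two powered wheels of a differential-drive robot are mounted on the left and right sides of the chassis. Their speeds can be controlled independently, and setting different speeds gives straight-line and turning control of the chassis. To keep its balance, the chassis usually carries one or two passive caster wheels, forming a three- or four-wheel layout.

The DWA path-planning algorithm that ships with ROS is a good fit ^[3]. Why?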

Forward Kinematic Model ^[3]

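Speed of one drive wheel

Speed of the chassis center

\[v = \frac{v_l + v_r}{2}\]

Once the approximate instantaneous velocity \(v\) of the robot is known, integrating it in the world frame gives the pose \(P\) of the robot in the world frame. In the rotation matrix below, \(c = \cos\theta\) and \(s = \sin\theta\).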

\[\dot P = \begin{bmatrix} c & -s & 0 \\ s & c & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} v_x \\ v_y \\ v_\theta \end{bmatrix}\]

\[P_1 = P_0 + \dot P \cdot \Delta t\]

//compute odometry in a typical way given the velocities of the robot
double dt = (current_time - last_time).toSec();

double delta_x = (vx * cos(th) - vy * sin(th)) * dt;
double delta_y = (vx * sin(th) + vy * cos(th)) * dt;
double delta_th = vth * dt;

x += delta_x;
y += delta_y;
th += delta_th;
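The same forward model and pose update can be written as a short Python sketch. The angular-velocity formula and the parameter `wheel_separation` (distance between the two drive wheels) do not appear in the text above; they are the usual textbook assumptions for a differential drive, with counter-clockwise rotation taken as positive.

```python
import math


def body_velocity(v_l, v_r, wheel_separation):
    """Forward kinematics: wheel speeds -> (linear, angular) chassis velocity."""
    v = (v_l + v_r) / 2.0                    # speed of the chassis center
    omega = (v_r - v_l) / wheel_separation   # assumed: CCW positive
    return v, omega


def integrate_pose(x, y, th, vx, vy, vth, dt):
    """One Euler step of the pose, mirroring the odometry update above."""
    x += (vx * math.cos(th) - vy * math.sin(th)) * dt
    y += (vx * math.sin(th) + vy * math.cos(th)) * dt
    th += vth * dt
    return x, y, th


if __name__ == "__main__":
    v, omega = body_velocity(0.4, 0.6, wheel_separation=0.5)
    pose = (0.0, 0.0, 0.0)
    for _ in range(10):
        # a differential drive has no lateral velocity, so vy = 0
        pose = integrate_pose(*pose, v, 0.0, omega, dt=0.1)
    print(pose)
```

A plain Euler step like this accumulates error when the robot turns quickly; a smaller `dt` or a midpoint heading reduces it.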

Backward Kinematic Model
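As a rough sketch, and taking the inverse model here to mean the map from a commanded chassis velocity to the two wheel speeds, a differential drive can be inverted as follows. The function name `wheel_speeds` and the parameter `wheel_separation` are illustrative assumptions; counter-clockwise rotation is taken as positive.

```python
def wheel_speeds(v, omega, wheel_separation):
    """Chassis velocity (v, omega) -> (left, right) wheel speeds.

    Inverts v = (v_l + v_r) / 2 and omega = (v_r - v_l) / wheel_separation.
    """
    half_diff = omega * wheel_separation / 2.0
    return v - half_diff, v + half_diff


if __name__ == "__main__":
    v_l, v_r = wheel_speeds(0.5, 0.4, wheel_separation=0.5)
    print(v_l, v_r)
```

Averaging the two returned speeds gives back `v`, which is a quick sanity check against the chassis-center formula.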

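Four-Wheel Ackermann Chassis (Four-wheeled Ackermann robot)

An Ackermann robot has four wheels and works on the same principle as a car: the two rear wheels are the drive wheels that provide power, the two front wheels are the steering wheels that set the direction, and the angles of the two front wheels are coupled through an Ackermann steering linkage. Because its construction resembles a car, an Ackermann robot also handles like one.

Four-Wheel Skid-Steer Chassis (Four-wheel sliding robot)

The four straight-running wheels of a four-wheel-drive robot are the same size, independently driven, and arranged symmetrically front to back and left to right; steering relies on the speed difference between the left and right wheels. Turning is achieved through sliding friction, so it causes some wear on the wheels and the ground. Because of the severe slippage, a four-wheel-drive robot is hard to control precisely.

Four-Wheel-Drive Four-Wheel-Steering Robot

A four-wheel-drive four-wheel-steering (4WD-4WS) robot effectively has eight motors controlling its motion. It can easily achieve omnidirectional motion, is mechanically simple, agile, and efficient, and adapts well to unstructured outdoor scenes. However, the larger number of motors places higher demands on control accuracy and synchronization, which makes control somewhat harder.

Omnidirectional Chassis (omnidirectional wheel robot)

This class of robot is somewhat special: the wheels are Mecanum wheels or omni wheels. By driving the wheels according to a certain rule, the robot can move omnidirectionally (forward, backward, left, and right). Compared with non-omnidirectional robots it is more agile and can move in narrow spaces. However, because of the limits of Mecanum and omni wheels, its load capacity is small. In addition, the forces produced by the individual wheels partly cancel one another, so the net thrust per unit of torque is lower and the overall efficiency is below that of a differential-drive robot.

Dual-Track Robot

A dual-track robot has one tracked drive unit on each side of the chassis. Each unit consists of a wheel train, a suspension, and a track. The wheel train includes several drive sprockets, road wheels, idlers, and return rollers; the suspension is usually a Christie suspension to ensure good obstacle-crossing performance; the track is usually made of a composite material that is strong, light, high-modulus, and non-shrinking. Dual-track robots cross obstacles very well and are widely used in complex outdoor environments.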

Reference

On-Manifold Optimization: Local Parameterization

2022-09-04T11:30:00.000Z

Overview

Manifold Space vs Tangent Space

Jacobian w.r.t Error State

Jacobian w.r.t Error State vs True State

According to ^[1], Section 2.4,

The idea is that for a \(x \in N\) the function \(g(\delta) := f (x \boxplus \delta)\) behaves locally in \(0\) like \(f\) does in \(x\). In particular \(\|f(x)\|^2\) has a minimum in \(x\) if and only if \(\|g(\delta)\|^2\) has a minimum in \(0\). Therefore finding a local optimum of \(g\), \(\delta = \arg \min_{\delta} \|g(\delta)\|^2\) implies \(x \boxplus \delta = \arg \min_{\xi} \|f(\xi)\|^2\).

\[f(x \boxplus \delta)=f(x)+J_x \delta+\mathcal{O}\left(\|\delta\|^2\right)\]

where

\[J = \left. \frac{\partial f(x \boxplus \delta)}{\partial \delta} \right|_{\delta=0}\quad \longleftrightarrow \quad J = \left. \frac{\partial f(x)}{\partial x} \right|_{x}\]

ESKF ^[2] 6.1.1: Jacobian computation

\[\left.\mathbf{H} \triangleq \frac{\partial h}{\partial \delta \mathbf{x}}\right|_{\mathbf{x}}=\left.\left.\frac{\partial h}{\partial \mathbf{x}_t}\right|_{\mathbf{x}} \frac{\partial \mathbf{x}_t}{\partial \delta \mathbf{x}}\right|_{\mathbf{x}}=\mathbf{H}_{\mathbf{x}} \mathbf{X}_{\delta \mathbf{x}}\]

\(x_t\): true state
\(x\): nominal state
\(\delta x\): error state

lifting and retraction:

\[\left.\mathbf{X}_{\delta \mathbf{x}} \triangleq \frac{\partial \mathbf{x}_t}{\partial \delta \mathbf{x}}\right|_{\mathbf{x}}=\left[\begin{array}{ccc}\mathbf{I}_6 & 0 & 0 \\0 & \mathbf{Q}_{\delta \boldsymbol{\theta}} & 0 \\0 & 0 & \mathbf{I}_9\end{array}\right]\]

the quaternion term

\[\begin{aligned}\left.\mathbf{Q}_{\delta \boldsymbol{\theta}} \triangleq \frac{\partial(\mathbf{q} \otimes \delta \mathbf{q})}{\partial \delta \boldsymbol{\theta}}\right|_{\mathbf{q}} &=\left.\left.\frac{\partial(\mathbf{q} \otimes \delta \mathbf{q})}{\partial \delta \mathbf{q}}\right|_{\mathbf{q}} \frac{\partial \delta \mathbf{q}}{\partial \delta \boldsymbol{\theta}}\right|_{\widehat{\delta \boldsymbol{\theta}}=0} \\&=\left.\left.\frac{\partial\left([\mathbf{q}]_L \delta \mathbf{q}\right)}{\partial \delta \mathbf{q}}\right|_{\mathbf{q}} \frac{\partial\left[\begin{array}{c}1 \\\frac{1}{2} \delta \boldsymbol{\theta}\end{array}\right]}{\partial \delta \boldsymbol{\theta}}\right|_{\widehat{\delta \boldsymbol{\theta}}=0} \\&=[\mathbf{q}]_L \frac{1}{2}\left[\begin{array}{lll}0 & 0 & 0 \\1 & 0 & 0 \\0 & 1 & 0 \\0 & 0 & 1\end{array}\right]\end{aligned}\]

Least Squares on a Manifold ^[3]

Local Parameterization in Ceres Solver ^[4] ^[5] ^[6] ^[7] ^[8]

class LocalParameterization {
 public:
  virtual ~LocalParameterization() = default;
  virtual bool Plus(const double* x,
                    const double* delta,
                    double* x_plus_delta) const = 0;
  virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
  virtual bool MultiplyByJacobian(const double* x,
                                  const int num_rows,
                                  const double* global_matrix,
                                  double* local_matrix) const;
  virtual int GlobalSize() const = 0;
  virtual int LocalSize() const = 0;
};

Plus

Retraction

\[\boxplus(x, \Delta)=x \operatorname{Exp}(\Delta)\]

ComputeJacobian

global w.r.t local

\[J_{GL} = \frac{\partial x_G}{\partial x_L}= D_2 \boxplus(x, 0) = \left. \frac{\partial \boxplus(x, \Delta)}{\partial \Delta} \right|_{\Delta = 0}\]

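See ^[9]

\(r\) w.r.t \(x_{L}\)

In ceres::CostFunction, provide the derivative of the residuals with respect to the variable on the manifold:

\[J_{rG} = \frac{\partial r}{\partial x_G}\]

The derivative with respect to the variable in the tangent space is then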

\[J_{rL} = \frac{\partial r}{\partial x_L}= \frac{\partial r}{\partial x_G} \cdot J_{GL}\]
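To make the global-versus-local Jacobian concrete, here is a small NumPy sketch for unit quaternions in wxyz order with Hamilton multiplication. It assumes the right-perturbation form \(x \operatorname{Exp}(\Delta)\) with a half-angle exponential, as in the ESKF derivation above; it is not a transcription of any Ceres class, and the function names are illustrative.

```python
import numpy as np


def quat_mul(p, q):
    """Hamilton product p * q for quaternions stored as [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ])


def quat_exp(delta):
    """Map a rotation vector (3,) to a unit quaternion (half-angle form)."""
    delta = np.asarray(delta, dtype=float)
    angle = np.linalg.norm(delta)
    if angle < 1e-12:
        return np.array([1.0, *(0.5 * delta)])  # first-order approximation
    axis = delta / angle
    return np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * axis))


def plus(x, delta):
    """Retraction: x boxplus delta = x * Exp(delta) (right perturbation)."""
    return quat_mul(x, quat_exp(delta))


def plus_jacobian(x):
    """4x3 Jacobian of plus(x, delta) w.r.t. delta at delta = 0."""
    w, a, b, c = x
    left = np.array([      # [x]_L, so that x * q == left @ q
        [w, -a, -b, -c],
        [a,  w, -c,  b],
        [b,  c,  w, -a],
        [c, -b,  a,  w],
    ])
    return 0.5 * left[:, 1:]   # [x]_L times 1/2 * [0; I_3]


if __name__ == "__main__":
    x = np.array([1.0, 2.0, 3.0, 4.0])
    x /= np.linalg.norm(x)
    J_GL = plus_jacobian(x)              # global w.r.t. local, 4x3
    J_rG = np.eye(4)                     # toy residual r(x) = x
    J_rL = J_rG @ J_GL                   # chain rule from the text, 4x3
    print(J_rL.shape)
```

Switching to a left perturbation means changing both `plus` (multiply on the left) and `plus_jacobian` (use the right-multiplication matrix) together; changing only one of them gives an inconsistent pair.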

Sub Class

QuaternionParameterization
EigenQuaternionParameterization

Custom QuaternionParameterization

See ^[7]

Summary

The Plus and ComputeJacobian methods of a QuaternionParameterization together determine whether the left-perturbation or the right-perturbation form is used.

Quaternion in Eigen


Quaterniond q1(1, 2, 3, 4);           // wxyz
Quaterniond q2(Vector4d(1, 2, 3, 4)); // xyzw
Quaterniond q3(tmp_q);                // xyzw, double tmp_q[4];
q.coeffs();                           // xyzw

Quaternion in Ceres Solver

order: wxyz
Quaternions in Ceres Solver are Hamilton quaternions and follow the Hamilton multiplication rule
Matrices are stored in raw memory in row-major order

Reference

[1] A Framework for Sparse, Non-Linear Least Squares Problems on Manifolds
[2] Quaternion kinematics for the error-state Kalman filter, Joan Solà
[3] A Tutorial on Graph-Based SLAM
[4] http://ceres-solver.org/nnls_modeling.html#localparameterization
[5] On-Manifold Optimization Demo using Ceres Solver
[6] Matrix Manifold Local Parameterizations for Ceres Solver
[7] [ceres-solver] From QuaternionParameterization to LocalParameterization 😄
[8] Notes on LocalParameterization subclasses: the QuaternionParameterization and EigenQuaternionParameterization classes
[9] Optimization libraries: Ceres (2), a deep dive into ceres::Problem

Mapillary API at a Glance

2022-07-23T16:00:00.000Z

Embed images ^[1]

The iframe tag below would show a 640x480px image with ID 550092599700936.

<iframe 
  src="https://www.mapillary.com/embed?image_key=550092599700936&style=classic" 
  height="480" 
  width="640"  
  frameborder="0">
</iframe>

Reference

[1] Mapillary API Documentation

Observability and Inconsistency in a Nutshell

2022-07-15T16:00:00.000Z


Overview

What is observability ?

In control theory, observability is a measure for how well internal states of a system can be inferred by knowledge of its external outputs.

What is consistency ?

A recursive estimator is consistent when the estimation errors are zero-mean and have covariance matrix equal to that reported by the estimator.

Observability \(\longrightarrow\) Consistency

Mismatch (actual vs true) in observability \(\longrightarrow\) Inconsistency

VINS observability properties \(\longrightarrow\) estimator inconsistency

Basics

Nullspace ^[1]

Lie Derivative ^[2]

Given a smooth scalar function \(h\) and smooth vector fields \(f\) and \(g\):

\[\begin{aligned}h(x):& \; R^n \rightarrow R \\f(x):& \; R^n \rightarrow R^n \\g(x):& \; R^n \rightarrow R^n\end{aligned}\]

The row-vector gradient \(\nabla h\) multiplied by the vector field \(f\) gives \(L_f h\), which is a scalar:

\[L_f h = \nabla h \cdot f = \left( \frac{\partial h}{\partial x} \right)^T f\]

\(L_g L_f h\) is again a scalar:

\[L_g L_f h = L_g (L_f h) = \nabla(L_f h) g = \left( \frac{\partial (L_f h)}{\partial x} \right)^T g= \left( \frac{\partial \left( (\frac{\partial h}{\partial x})^T f \right)}{\partial x} \right)^T g\]

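To summarize: the difference between a Lie derivative and an ordinary derivative is that a Lie derivative is defined between two functions \(h\) and \(f\), both functions of the vector \(x\) and tied together through that shared \(x\), whereas an ordinary derivative is defined for a single function with respect to \(x\).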
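A small worked example may help. The functions below are chosen purely for illustration and do not come from the cited references. Take \(x = (x_1, x_2)^T\) with

\[h(x) = x_1^2 + x_2, \qquad f(x) = \begin{bmatrix} x_2 \\ -x_1 \end{bmatrix}, \qquad g(x) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}\]

The gradient is \(\nabla h = \begin{bmatrix} 2 x_1 & 1 \end{bmatrix}\), so

\[L_f h = \nabla h \cdot f = 2 x_1 x_2 - x_1\]

Differentiating this scalar once more gives \(\nabla (L_f h) = \begin{bmatrix} 2 x_2 - 1 & 2 x_1 \end{bmatrix}\), and therefore

\[L_g L_f h = \nabla (L_f h) \, g = 2 x_2 - 1\]

Both results are scalar functions of \(x\), as stated above.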

Observability Analysis ^[4]

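what: in control theory, observability ^[3] is a measure of how well the internal states of a system can be inferred from its external outputs.
why: to make the unobservable dimensions of the estimator match those of the true system, and thereby improve accuracy.
how: compute the observability matrix and analyze its null space to determine which state dimensions are observable and which are not; the row space of the observability matrix corresponds to the observable directions, and its null space corresponds to the unobservable ones.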

Unobservable DoF (Gauge Freedom) in SLAM

Mono vSLAM: 7
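- 6-DoF absolute pose + scale
Stereo vSLAM: 6
- 6-DoF absolute pose
Mono + IMU SLAM: 4
- 3-DoF absolute position + absolute yaw
- roll and pitch are observable because of gravity; the scale factor is observable because of the accelerometer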

Observability Matrix

The discrete state-space equations of a nonlinear system (linearized, with noise ignored) are

\[\begin{cases} x_{k+1} = \Phi_k x_k \\ y_k = H_k x_k \end{cases}\]

according to the Lie derivative, the observability matrix is

\[\mathbf{\mathcal{O}} \left(\mathbf{x}^{\star}\right) =\left[ \begin{array}{c} \mathbf{H}_{1} \\ \mathbf{H}_{2} \boldsymbol{\Phi}_{2,1} \\ \vdots \\ \mathbf{H}_{k} \boldsymbol{\Phi}_{k, 1} \end{array}\right]\]

then the number of unobservable dimensions of the system is

\[\text{rank}(N(\mathcal{O}))\]
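As an illustration, the stacked matrix and the size of its null space can be computed numerically. The sketch below uses a toy two-state system (position and velocity) that is not from the text; the helper names and the convention that `Phi_list[i]` propagates from step `i+1` to step `i+2` are assumptions made for the example.

```python
import numpy as np


def observability_matrix(H_list, Phi_list):
    """Stack H_1, H_2 Phi_{2,1}, ..., H_k Phi_{k,1} row-wise."""
    n = H_list[0].shape[1]
    Phi_k1 = np.eye(n)                 # Phi_{1,1} = I
    blocks = [H_list[0] @ Phi_k1]
    for H, Phi in zip(H_list[1:], Phi_list):
        Phi_k1 = Phi @ Phi_k1          # accumulate Phi_{k,1}
        blocks.append(H @ Phi_k1)
    return np.vstack(blocks)


def nullity(O, tol=1e-9):
    """Dimension of the null space of O (number of unobservable directions)."""
    return O.shape[1] - np.linalg.matrix_rank(O, tol=tol)


if __name__ == "__main__":
    dt = 0.1
    Phi = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity model
    H_vel = np.array([[0.0, 1.0]])            # only velocity is measured
    H_pos = np.array([[1.0, 0.0]])            # only position is measured

    print(nullity(observability_matrix([H_vel] * 3, [Phi] * 2)))  # 1
    print(nullity(observability_matrix([H_pos] * 3, [Phi] * 2)))  # 0
```

With velocity-only measurements the absolute position can never be recovered, which shows up as a one-dimensional null space; measuring position makes both states observable.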

Observability Matrix vs Hessian(Information) Matrix

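For a SLAM system (such as monocular VO), if the measurements do not change when the state is changed, the cost function does not change either, and this in turn means that the information matrix H of the least-squares problem has a null space.

For monocular VO based on optimization, the dimension of the null space of the Hessian (information) matrix \(H\) is 7; that is, the number of unobservable dimensions is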

\[\text{rank}(N(H)) = 7\]

\[J^T J \Delta x = - J^T r \quad \longrightarrow \quad H \Delta x = b\]

What is the relationship between the Hessian matrix \(H\) and the observability matrix \(\mathcal{O}\) in the optimization based VO/VIO ?

paper: Observability-Based Guidance and Sensor Placement (Chapter 2 - OBSERVABILITY MEASURES)

> As a note, the measurement Jacobian, \(dY\), is equivalent to the observability matrix, \(d\mathcal{O}\), evaluated at a nominal state, \(x_0\).

Summary given by Dr. Yijia He:

NEES (normalized estimation error squared)

NEES closer to 6 for VIO
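For reference, NEES is the squared estimation error weighted by the inverse of the reported covariance, and for a consistent estimator its expected value equals the dimension of the error state. The sketch below assumes that the "6" above refers to a 6-DoF pose error (3 for orientation, 3 for position); the function name `nees` is illustrative.

```python
import numpy as np


def nees(error, covariance):
    """Normalized estimation error squared: e^T P^{-1} e."""
    error = np.asarray(error, dtype=float)
    covariance = np.asarray(covariance, dtype=float)
    return float(error @ np.linalg.solve(covariance, error))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = np.diag([0.01, 0.01, 0.02, 0.1, 0.1, 0.2])   # reported 6x6 covariance
    # Errors actually drawn from N(0, P): the estimator is consistent by construction.
    errors = rng.multivariate_normal(np.zeros(6), P, size=5000)
    print(np.mean([nees(e, P) for e in errors]))     # close to 6 on average
```

An average NEES well above the state dimension suggests the estimator is overconfident (reported covariance too small); a value well below suggests it is too conservative.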

Inconsistency of Estimator

A state estimator is consistent if the estimation errors (i) are zero-mean, and (ii) have a covariance matrix smaller than or equal to the one calculated by the filter.

Degeneracy (Insufficient Restraint) / Inconsistency in SLAM

Motion

constant acceleration
pure translation

Structure

Maintaining Consistency (Resolving Inconsistency)

open_vins #171: Consistency maintenance methods, FEJ vs Observability-constrained(OC) ones
MSCKF notes: the observability problem
How should consistency in the EKF be understood?

paper:

VINS on Wheels
- Odometry measurements
- Planar-motion constraints

FEJ (First-Estimate Jacobians)

paper: A First-Estimates Jacobian EKF for Improving SLAM Consistency
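the Jacobians are evaluated at the first available estimate of each state
to ensure that the state transition and Jacobian matrices are evaluated at correct linearization points such that the above observability analysis will hold true

FEJ algorithm: when different residuals are differentiated with respect to the same state, the linearization point must be the same. This avoids shrinking the null space, which would otherwise make unobservable variables appear observable.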

app:

OKVIS
DSO
- DSO - First Estimates Jacobian
Improved version of ElasticFusion (IROS 2017, Stefan Leutenegger)
OpenVINS
- First-Estimate Jacobian Estimators (OpenVINS)
- OpenVINS (7) - observability and consistency analysis, and FEJ

ref:

FEJ2

TODO

Observability Constraint (OC)-VINS

App: OC-MSC-KF

MSCKF-VIO (S-MSCKF):

Modification of the State Transition Matrix \(\Phi\)

// Modify the transition matrix to make the observability matrix have proper null space
Matrix3d R_kk_1 = quaternionToRotation(imu_state.orientation_null);
Phi.block<3, 3>(0, 0) = quaternionToRotation(imu_state.orientation) * R_kk_1.transpose();

Vector3d u = R_kk_1 * IMUState::gravity;
RowVector3d s = (u.transpose() * u).inverse() * u.transpose();

Matrix3d A1 = Phi.block<3, 3>(6, 0);
Vector3d w1 = skewSymmetric(imu_state.velocity_null - imu_state.velocity) * IMUState::gravity;
Phi.block<3, 3>(6, 0) = A1 - (A1 * u - w1) * s;

Matrix3d A2 = Phi.block<3, 3>(12, 0);
Vector3d w2 = skewSymmetric(dtime * imu_state.velocity_null + imu_state.position_null - imu_state.position) * IMUState::gravity;
Phi.block<3, 3>(12, 0) = A2 - (A2 * u - w2) * s;

Modification of the Measurement Jacobian \(H\)

// Modify the measurement Jacobian to enforce the observability constraint. Ref: OC-VINS
Matrix<double, 4, 6> A = H_x;
Matrix<double, 6, 1> u = Matrix<double, 6, 1>::Zero();
u.block<3, 1>(0, 0) = quaternionToRotation(cam_state.orientation_null) * IMUState::gravity;
u.block<3, 1>(3, 0) = skewSymmetric(p_w - cam_state.position_null) * IMUState::gravity;
H_x = A - A * u * (u.transpose() * u).inverse() * u.transpose();
H_f = -H_x.block<4, 3>(0, 3);
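The key line above, `H_x = A - A * u * (u^T u)^{-1} * u^T`, removes from each row of `A` its component along `u`, so that the modified Jacobian satisfies `H_x u = 0`. A small NumPy sketch of just that projection, with random data standing in for the real Jacobian and null-space direction (the function name `project_out` is illustrative):

```python
import numpy as np


def project_out(A, u):
    """Return A - A u (u^T u)^{-1} u^T, so that the result times u is zero."""
    A = np.asarray(A, dtype=float)
    u = np.asarray(u, dtype=float)
    return A - np.outer(A @ u, u) / (u @ u)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(4, 6))    # stands in for the 4x6 block H_x
    u = rng.normal(size=6)         # stands in for the desired null-space direction
    H = project_out(A, u)
    print(np.allclose(H @ u, 0.0))  # True
```

Among all matrices whose null space contains `u`, this orthogonal projection is the one closest to `A` in the Frobenius norm, which is why it changes the original Jacobian as little as possible.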

Gauge Freedom Handling

It is well known that visual-inertial systems have four degrees of freedom that are not observable: the global position and the rotation around gravity. These unobservable degrees of freedom (called gauge freedom) have to be handled properly in visual-inertial state estimation to obtain a unique state estimate.

paper: On the Comparison of Gauge Freedom Handling in Optimization-based Visual-Inertial State Estimation
code: Covariance Transformation for Visual-Inertial Systems

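If H has the correct null space (for monocular VO, rank(N(H)) = 7), then H is singular, and the normal equations are always ill-conditioned or numerically unstable to solve; this is resolved by handling the gauge freedom.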

In optimization-based methods, three approaches are usually used:

Gauge Fixation: fixing the initial state,
Gauge Prior: adding a prior to the initial state,
Free Gauge: allowing the parameters to evolve freely during optimization.

ref:

FAQ

Why is the position unobservable? For monocular VO, is the first frame not held fixed?

Reference

PyTorch on Ubuntu 18.04

2022-06-24T16:00:00.000Z


Install PyTorch locally

https://pytorch.org/get-started/locally/

Conda (Python)

conda create -n torch python=3.9
conda activate torch

# select one
conda install pytorch -c soumith 
conda install pytorch torchvision cuda80 -c soumith
conda install pytorch==1.0.0 torchvision==0.2.1 cuda80 -c pytorch

# PyTorch 1.10
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch

test

python
>>> import torch

Pip (Python)


pip install torchvision==0.4.0 -f https://download.pytorch.org/whl/torch_stable.html

pip uninstall torchvision

LibTorch (C++)

Download the prebuilt libs

`set(CMAKE_PREFIX_PATH "/libtorch/share/cmake/Torch")`

Build from Source

# example
git clone --recursive -b v1.0.1 https://github.com/pytorch/pytorch

cd pytorch
mkdir build
cd build

# options
export NO_CUDA=1
    
python ../tools/build_libtorch.py

By default, the built libtorch library is located at pytorch/torch/lib/tmp_install/.

`set(Torch_DIR "/pytorch/torch/lib/tmp_install/share/cmake/Torch/")`

use with cmake


find_package(Torch REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")
target_link_libraries(<target> ${TORCH_LIBRARIES})

include dir

libtorch/include
libtorch/include/torch/csrc/api/include

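PyTorch Model Files

PyTorch model files are usually saved as .pth files, while the C++ API usually loads .pt files.
A .pt file is obtained by converting the model restored from a .pth file with torch.jit.trace.

Saving only the model parameters, not the model structure

# save
torch.save(model.state_dict(), "mymodel.pth")  # saves only the weights, not the model structure

# load
model = My_model(*args, **kwargs)  # the model structure, My_model, has to be rebuilt here
model.load_state_dict(torch.load("mymodel.pth"))  # load the stored parameters into that structure
model.eval()

Saving the whole model, structure plus parameters

# save
torch.save(model, "mymodel.pth")  # saves the entire model object

# load
model = torch.load("mymodel.pth")  # no need to rebuild the structure; load it directly
model.eval()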

FAQ

UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero


# works for me
sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm

Yolo Test on Ubuntu 18.04

2022-06-24T16:00:00.000Z


YOLO

Darknet: Open Source Neural Networks in C
YOLO: Real-Time Object Detection

Build / Install

Detection Using A Pre-Trained Model

`./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg`

output

layer     filters    size              input                output
    0 conv     32  3 x 3 / 1   608 x 608 x   3   ->   608 x 608 x  32  0.639 BFLOPs
    1 conv     64  3 x 3 / 2   608 x 608 x  32   ->   304 x 304 x  64  3.407 BFLOPs
    2 conv     32  1 x 1 / 1   304 x 304 x  64   ->   304 x 304 x  32  0.379 BFLOPs
    3 conv     64  3 x 3 / 1   304 x 304 x  32   ->   304 x 304 x  64  3.407 BFLOPs
    4 res    1                 304 x 304 x  64   ->   304 x 304 x  64
    5 conv    128  3 x 3 / 2   304 x 304 x  64   ->   152 x 152 x 128  3.407 BFLOPs
    6 conv     64  1 x 1 / 1   152 x 152 x 128   ->   152 x 152 x  64  0.379 BFLOPs
    7 conv    128  3 x 3 / 1   152 x 152 x  64   ->   152 x 152 x 128  3.407 BFLOPs
    8 res    5                 152 x 152 x 128   ->   152 x 152 x 128
    9 conv     64  1 x 1 / 1   152 x 152 x 128   ->   152 x 152 x  64  0.379 BFLOPs
  10 conv    128  3 x 3 / 1   152 x 152 x  64   ->   152 x 152 x 128  3.407 BFLOPs
  11 res    8                 152 x 152 x 128   ->   152 x 152 x 128
  12 conv    256  3 x 3 / 2   152 x 152 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  13 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  14 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  15 res   12                  76 x  76 x 256   ->    76 x  76 x 256
  16 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  17 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  18 res   15                  76 x  76 x 256   ->    76 x  76 x 256
  19 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  20 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  21 res   18                  76 x  76 x 256   ->    76 x  76 x 256
  22 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  23 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  24 res   21                  76 x  76 x 256   ->    76 x  76 x 256
  25 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  26 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  27 res   24                  76 x  76 x 256   ->    76 x  76 x 256
  28 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  29 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  30 res   27                  76 x  76 x 256   ->    76 x  76 x 256
  31 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  32 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  33 res   30                  76 x  76 x 256   ->    76 x  76 x 256
  34 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  35 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  36 res   33                  76 x  76 x 256   ->    76 x  76 x 256
  37 conv    512  3 x 3 / 2    76 x  76 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  38 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  39 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  40 res   37                  38 x  38 x 512   ->    38 x  38 x 512
  41 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  42 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  43 res   40                  38 x  38 x 512   ->    38 x  38 x 512
  44 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  45 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  46 res   43                  38 x  38 x 512   ->    38 x  38 x 512
  47 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  48 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  49 res   46                  38 x  38 x 512   ->    38 x  38 x 512
  50 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  51 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  52 res   49                  38 x  38 x 512   ->    38 x  38 x 512
  53 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  54 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  55 res   52                  38 x  38 x 512   ->    38 x  38 x 512
  56 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  57 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  58 res   55                  38 x  38 x 512   ->    38 x  38 x 512
  59 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  60 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  61 res   58                  38 x  38 x 512   ->    38 x  38 x 512
  62 conv   1024  3 x 3 / 2    38 x  38 x 512   ->    19 x  19 x1024  3.407 BFLOPs
  63 conv    512  1 x 1 / 1    19 x  19 x1024   ->    19 x  19 x 512  0.379 BFLOPs
  64 conv   1024  3 x 3 / 1    19 x  19 x 512   ->    19 x  19 x1024  3.407 BFLOPs
  65 res   62                  19 x  19 x1024   ->    19 x  19 x1024
  66 conv    512  1 x 1 / 1    19 x  19 x1024   ->    19 x  19 x 512  0.379 BFLOPs
  67 conv   1024  3 x 3 / 1    19 x  19 x 512   ->    19 x  19 x1024  3.407 BFLOPs
  68 res   65                  19 x  19 x1024   ->    19 x  19 x1024
  69 conv    512  1 x 1 / 1    19 x  19 x1024   ->    19 x  19 x 512  0.379 BFLOPs
  70 conv   1024  3 x 3 / 1    19 x  19 x 512   ->    19 x  19 x1024  3.407 BFLOPs
  71 res   68                  19 x  19 x1024   ->    19 x  19 x1024
  72 conv    512  1 x 1 / 1    19 x  19 x1024   ->    19 x  19 x 512  0.379 BFLOPs
  73 conv   1024  3 x 3 / 1    19 x  19 x 512   ->    19 x  19 x1024  3.407 BFLOPs
  74 res   71                  19 x  19 x1024   ->    19 x  19 x1024
  75 conv    512  1 x 1 / 1    19 x  19 x1024   ->    19 x  19 x 512  0.379 BFLOPs
  76 conv   1024  3 x 3 / 1    19 x  19 x 512   ->    19 x  19 x1024  3.407 BFLOPs
  77 conv    512  1 x 1 / 1    19 x  19 x1024   ->    19 x  19 x 512  0.379 BFLOPs
  78 conv   1024  3 x 3 / 1    19 x  19 x 512   ->    19 x  19 x1024  3.407 BFLOPs
  79 conv    512  1 x 1 / 1    19 x  19 x1024   ->    19 x  19 x 512  0.379 BFLOPs
  80 conv   1024  3 x 3 / 1    19 x  19 x 512   ->    19 x  19 x1024  3.407 BFLOPs
  81 conv    255  1 x 1 / 1    19 x  19 x1024   ->    19 x  19 x 255  0.189 BFLOPs
  82 yolo
  83 route  79
  84 conv    256  1 x 1 / 1    19 x  19 x 512   ->    19 x  19 x 256  0.095 BFLOPs
  85 upsample            2x    19 x  19 x 256   ->    38 x  38 x 256
  86 route  85 61
  87 conv    256  1 x 1 / 1    38 x  38 x 768   ->    38 x  38 x 256  0.568 BFLOPs
  88 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  89 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  90 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  91 conv    256  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 256  0.379 BFLOPs
  92 conv    512  3 x 3 / 1    38 x  38 x 256   ->    38 x  38 x 512  3.407 BFLOPs
  93 conv    255  1 x 1 / 1    38 x  38 x 512   ->    38 x  38 x 255  0.377 BFLOPs
  94 yolo
  95 route  91
  96 conv    128  1 x 1 / 1    38 x  38 x 256   ->    38 x  38 x 128  0.095 BFLOPs
  97 upsample            2x    38 x  38 x 128   ->    76 x  76 x 128
  98 route  97 36
  99 conv    128  1 x 1 / 1    76 x  76 x 384   ->    76 x  76 x 128  0.568 BFLOPs
  100 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  101 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  102 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  103 conv    128  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 128  0.379 BFLOPs
  104 conv    256  3 x 3 / 1    76 x  76 x 128   ->    76 x  76 x 256  3.407 BFLOPs
  105 conv    255  1 x 1 / 1    76 x  76 x 256   ->    76 x  76 x 255  0.754 BFLOPs
  106 yolo
Loading weights from yolov3.weights...Done!
data/dog.jpg: Predicted in 18.148226 seconds.
dog: 100%
truck: 92%
bicycle: 99%

The CUDA "out of memory" error was fixed by lowering the batch size in the cfg file:

# cfg/yolov3.cfg
batch=16

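Annotated contents of the configuration file yolov3.cfg (note that the [maxpool], [reorg], and [region] sections shown here are YOLOv2-style layers):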

[net]
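batch=64                           # the parameters are updated once every `batch` samples
subdivisions=8                     # if memory is limited, each batch is split into `subdivisions` mini-batches of size batch/subdivisions
                                   # (inside the darknet code, batch/subdivisions is itself named batch)
height=416                         # input image height
width=416                          # input image width
channels=3                         # number of input image channels
momentum=0.9                       # momentum
decay=0.0005                       # weight-decay regularization term, to reduce overfitting
angle=0                            # generate more training samples by rotating the image
saturation = 1.5                   # generate more training samples by adjusting saturation
exposure = 1.5                     # generate more training samples by adjusting exposure
hue=.1                             # generate more training samples by adjusting hue

learning_rate=0.0001               # initial learning rate
max_batches = 45000                # training stops once max_batches is reached
policy=steps                       # learning-rate policy, one of: CONSTANT, STEP, EXP, POLY, STEPS, SIG, RANDOM
steps=100,25000,35000              # batch numbers at which the learning rate is adjusted
scales=10,.1,.1                    # factor applied at each step; the factors multiply cumulatively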

[convolutional]
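batch_normalize=1                  # whether to apply batch normalization
filters=32                         # number of output feature maps
size=3                             # kernel size
stride=1                           # convolution stride
pad=1                              # if pad=0, the padding is given by the `padding` parameter; if pad=1, the padding is size/2
activation=leaky                   # activation function, one of:
                                   # logistic, loggy, relu, elu, relie, plse, hardtan, lhtan, linear, ramp, leaky, tanh, stair

[maxpool]
size=2                             # pooling window size
stride=2                           # pooling stride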

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

......
......

#######

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[route]                            # the route layer is to bring finer grained features in from earlier in the network
layers=-9

[reorg]                            # the reorg layer is to make these features match the feature map size at the later layer. 
                                   # The end feature map is 13x13, the feature map from earlier is 26x26x512. 
                                   # The reorg layer maps the 26x26x512 feature map onto a 13x13x2048 feature map 
                                   # so that it can be concatenated with the feature maps at 13x13 resolution.
stride=2

[route]
layers=-1,-3

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=125                        # the filter count of the last convolutional layer before [region] is fixed: filters = num*(classes+5)
                                   # the 5 stands for the five per-box values tx, ty, tw, th, to from the paper
activation=linear

[region]
anchors = 1.08,1.19,  3.42,4.41,  6.63,11.38,  9.42,5.11,  16.62,10.52          # anchor boxes; they can be chosen by hand
                                                                                # or learned from the training samples with k-means
bias_match=1
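classes=20                         # number of object classes the network has to recognize
coords=4                           # the 4 coordinates tx, ty, tw, th of each box
num=5                              # boxes predicted per grid cell, equal to the number of anchors. Increase num to use more anchors; if Obj then tends to 0 during training, try increasing object_scale
softmax=1                          # use softmax as the activation for the class scores
jitter=.2                          # add noise through random jitter to reduce overfitting
rescore=1                          # tentatively understood as a switch: when nonzero, l.delta (the difference between prediction and ground truth) is adjusted by rescoring

object_scale=5                     # weight of the bbox confidence loss in the total loss when the grid cell contains an object
noobject_scale=1                   # weight of the bbox confidence loss in the total loss when the grid cell contains no object
class_scale=1                      # weight of the class loss in the total loss
coord_scale=1                      # weight of the bbox coordinate loss in the total loss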

absolute=1
thresh = .6
random=0                           # random=1 enables multi-scale training: images of randomly varying size are used for training
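To make the arithmetic behind a few of these settings concrete, here is a small Python sketch. It assumes that under policy=steps the rate is multiplied by each `scales` entry once the batch number reaches the matching `steps` entry (burn-in is ignored), and it reuses the filters formula from the comments above. The function names are made up for illustration and are not darknet APIs.

```python
def steps_learning_rate(batch_num, base_lr=0.0001,
                        steps=(100, 25000, 35000), scales=(10, 0.1, 0.1)):
    """Learning rate at a given batch number; the scales multiply cumulatively."""
    rate = base_lr
    for step, scale in zip(steps, scales):
        if batch_num < step:
            break
        rate *= scale
    return rate


def region_filters(num, classes, coords=4):
    """Filters of the last conv layer before [region]: num * (classes + coords + 1)."""
    return num * (classes + coords + 1)


def mini_batch_size(batch, subdivisions):
    """Number of samples pushed through the network at once."""
    return batch // subdivisions


for batch_num in (0, 100, 25000, 35000):
    print(batch_num, steps_learning_rate(batch_num))
print(region_filters(num=5, classes=20))  # 125, matching filters=125 above
print(mini_batch_size(64, 8))             # 8
```

With the values from the listing, the rate is raised tenfold after 100 batches and then lowered by a factor of ten at 25000 and again at 35000 batches.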

Training YOLO on VOC

Yolo for ROS

https://github.com/leggedrobotics/darknet_ros
YOLO ROS: Real-Time Object Detection for ROS

cfg files

darknet_ros.launch




<arg name="ros_param_file"             default="$(find darknet_ros)/config/ros.yaml"/>
<arg name="network_param_file"         default="$(find darknet_ros)/config/yolov3.yaml"/>

ros.yaml

# ros.yaml
subscribers:
  camera_reading:
    topic: /camera/zed/rgb/image_rect_color
    queue_size: 1

Build

cd ws_yolo/src
git clone --recursive git@github.com:leggedrobotics/darknet_ros.git
cd ../

catkin_make -DCMAKE_BUILD_TYPE=Release
# or
catkin build darknet_ros -DCMAKE_BUILD_TYPE=Release

Image Exposure Controller

2022-06-21T16:00:00.000Z

msckf-vio Issue: About image exposure controller #41 ^[1]

The auto exposure code is not open source because for one the actual implementation is ugly, and second because it ties into other software that is not open sourced.

Here are a few (edited) notes that I sent to somebody who asked about how it works:

what we implemented is super simple but works well in practice:
\[ \text{new-shutter-time} = \text{current-shutter-time} \cdot \frac{\text{desired-brightness}}{\text{current-brightness}} \]
You can go fancier by implementing
\[ \text{new-shutter-time} = \text{current-shutter-time} \cdot \left(\frac{\text{desired-brightness}}{\text{current-brightness}}\right)^\alpha\]
\(\alpha \leq 1\), but in practice \(\alpha = 1\) works just fine.
brightness is computed from every 16th row and column, so in fact only 1 out of every 256 pixels is used for the brightness computation
threshold on brightness:
we only change shutter time if
\[ |\text{desired-brightness} - \text{current-brightness}| > \text{threshold}\]
This is so that we don't torture the ptgrey cameras with constant shutter speed changes.
region of interest (ROI) support: we compute brightness only from the bottom X%. Typically X=70, so we will ignore the top 30% of the image, because that's often where the sun or sky are, and where we don't usually pick up features.
configurable max shutter limit: the max shutter time is the min of the shutter time imposed by frame rate, and a configurable hard shutter limit (in case we get motion blur and decide we'd rather live with a darker image and/or gain noise than the motion blur).
auto gain: once we hit max shutter, we turn on auto gain (and when it gets brighter again, we take the gain off first, then decrease shutter).

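\[\text{if} \; \left| \bar{B} - B_c \right| > B_{th}: \quad T_s \leftarrow T_s \cdot \frac{\bar{B}}{B_c}, \qquad \text{if} \; T_s \geq T_{max}: \quad T_s \leftarrow T_{max}, \; \text{auto gain}\]
Here \(\bar{B}\) is the desired brightness, \(B_c\) the current brightness, \(B_{th}\) the brightness threshold, \(T_s\) the shutter time, and \(T_{max}\) the maximum shutter time.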
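The notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the (closed-source) implementation the notes describe: the function names, the row-major list-of-lists image, and the way the gain flag is handled are all made up for this example.

```python
def sampled_brightness(image, roi_bottom_fraction=0.7, stride=16):
    """Mean of every `stride`-th row and column in the bottom part of a grayscale image.

    `image` is a non-empty list of rows; only the bottom `roi_bottom_fraction` is used.
    """
    rows = len(image)
    first_row = int(rows * (1.0 - roi_bottom_fraction))
    samples = [image[r][c]
               for r in range(first_row, rows, stride)
               for c in range(0, len(image[r]), stride)]
    return sum(samples) / len(samples)


def update_exposure(shutter, gain_on, current_brightness, desired_brightness,
                    threshold, max_shutter, alpha=1.0):
    """One control step; returns (new_shutter, auto_gain_enabled)."""
    if abs(desired_brightness - current_brightness) <= threshold:
        return shutter, gain_on          # inside the dead band: leave the camera alone
    ratio = (desired_brightness / max(current_brightness, 1e-6)) ** alpha
    if gain_on and ratio < 1.0:
        return shutter, False            # scene got brighter: take the gain off first
    new_shutter = shutter * ratio
    if new_shutter >= max_shutter:
        return max_shutter, True         # shutter is saturated: switch auto gain on
    return new_shutter, gain_on


print(update_exposure(10.0, False, current_brightness=50, desired_brightness=100,
                      threshold=5, max_shutter=30.0))  # (20.0, False)
```

Here `max_shutter` stands for the minimum of the frame-rate limit and the configurable hard limit mentioned in the notes.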

Implementation

image_exposure_control for RS cam ^[2]
vo-autoexpose ^[3] : An auto-exposure algorithm for maxing out VO performance in challenging light conditions

Reference

A-LOAM

2022-06-18T16:00:00.000Z

[TOC]

Overview

A-LOAM is an Advanced implementation of LOAM (J. Zhang and S. Singh. LOAM: Lidar Odometry and Mapping in Real-time), which uses Eigen and Ceres Solver to simplify code structure.

Code: A-LOAM (annotated version)

ROS Graph

Pipeline

Lidar Hardware

Hokuyo UTM-30LX

Vertical
- sweep: \(180^\circ / s\), a rotation from \(-90^\circ\) to \(90^\circ\) or in the inverse direction (lasting for 1 s), with the horizontal orientation of the laser scanner as zero
- FOV: \(180^\circ\)
- scan rate: 40 lines/sec
- resolution: \(180^\circ / 40 = 4.5^\circ\)
Horizontal (a scan plane)
- resolution: \(0.25^\circ\) within a scan

VLP-16

Time of flight distance measurement with calibrated reflectivities
16 channels
Measurement range up to 100 meters
Accuracy: +/- 3 cm (typical)
Dual returns
Field of view (vertical): 30° (+15° to -15°)
Angular resolution (vertical): 2°
Field of view (horizontal/azimuth): 360°
Angular resolution (horizontal/azimuth): 0.1° - 0.4°
Rotation rate: 5 - 20 Hz

Scan Registration

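Data preprocessing

Data cleaning

pcl::removeNaNFromPointCloud(laserCloudIn, laserCloudIn, indices);
removeClosedPointCloud(laserCloudIn, laserCloudIn, MINIMUM_RANGE);

Point clouds stored per scan line (the integer part of intensity holds the scan ID, the fractional part the time offset within the sweep)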

float relTime = (ori - startOri) / (endOri - startOri);
point.intensity = scanID + scanPeriod * relTime;
laserCloudScans[scanID].push_back(point);

Curvature computation (using the five points before and after each point)

for (int i = 5; i < cloudSize - 5; i++) {
  // ...
  cloudCurvature[i] = diffX * diffX + diffY * diffY + diffZ * diffZ;
  cloudSortInd[i] = i;
  cloudNeighborPicked[i] = 0;
  cloudLabel[i] = 0;
}
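The elided part of the loop accumulates the difference between the point and its ten neighbours. A minimal Python sketch of that idea follows, assuming the points of one scan line are given in scan order as (x, y, z) tuples; the function name is made up for illustration.

```python
def curvature(points, half_window=5):
    """Smoothness value per point: squared norm of the summed neighbour offsets.

    points: list of (x, y, z) tuples along one scan line, in scan order.
    The first and last `half_window` points keep the value 0.0.
    """
    n = len(points)
    values = [0.0] * n
    for i in range(half_window, n - half_window):
        diff = [0.0, 0.0, 0.0]
        for j in range(i - half_window, i + half_window + 1):
            if j == i:
                continue
            for axis in range(3):
                diff[axis] += points[j][axis] - points[i][axis]
        values[i] = sum(d * d for d in diff)
    return values


# An L-shaped scan: six points along x, then six points along y.
scan = [(float(i), 0.0, 0.0) for i in range(6)] + [(5.0, float(k), 0.0) for k in range(1, 7)]
print(curvature(scan)[5])  # 450.0 at the corner
```

On a straight, evenly sampled segment the neighbour offsets cancel and the value is zero, which is why a large value indicates an edge and a small one a planar patch.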

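Feature extraction

Features are extracted from the point cloud by curvature, and the points on each scan line are sorted into two classes, edge points and planar points:

sharp edges
planar surface patches

For each scan line

The points of each scan are split into 6 equal segments and processed segment by segment, which ensures that feature points are selected all around the sensor.

In each segment, among the points with curvature greater than 0.1

the 2 points with the largest curvature go into the sharp set cornerPointsSharp, with cloudLabel[ind] = 2
the 20 points with the largest curvature go into the less-sharp set cornerPointsLessSharp (the 2 sharp points are included; the others get cloudLabel[ind] = 1)
the 5 consecutive points before and after a selected point that lie close to it are excluded from further selection, which keeps feature points from clustering and spreads them as evenly as possible in every direction

In each segment, among the points with curvature less than 0.1

points go into the flat set surfPointsFlat, with cloudLabel[ind] = -1
the 5 consecutive points before and after a selected point that lie close to it are excluded in the same way

In each segment, all remaining points with cloudLabel[k] <= 0 (including the points excluded earlier) are assigned to the planar set surfPointsLessFlatScan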
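The selection rules above can be sketched as follows. This is a simplified illustration, not the A-LOAM code: neighbours are always suppressed (the real code only suppresses neighbours that are close in space), the limit of 4 flat points per segment is an assumption not stated in the text, and the function name is made up.

```python
def select_features(curv, n_segments=6, max_sharp=2, max_less_sharp=20,
                    max_flat=4, threshold=0.1, guard=5):
    """Label the points of one scan line: 2 sharp, 1 less sharp, -1 flat, 0 otherwise."""
    n = len(curv)
    label = [0] * n
    picked = [False] * n

    def mark(i):
        # Exclude the selected point and up to `guard` neighbours on each side.
        for k in range(max(0, i - guard), min(n, i + guard + 1)):
            picked[k] = True

    for s in range(n_segments):
        start, end = n * s // n_segments, n * (s + 1) // n_segments
        order = sorted(range(start, end), key=lambda i: curv[i])

        corners = 0
        for i in reversed(order):              # largest curvature first
            if picked[i] or curv[i] <= threshold:
                continue
            corners += 1
            if corners > max_less_sharp:
                break
            label[i] = 2 if corners <= max_sharp else 1
            mark(i)

        flats = 0
        for i in order:                        # smallest curvature first
            if picked[i] or curv[i] >= threshold:
                continue
            label[i] = -1
            mark(i)
            flats += 1
            if flats >= max_flat:
                break
    return label


values = [0.0] * 60
values[30] = 5.0
labels = select_features(values, n_segments=1)
print(labels[30], labels.count(-1))  # 2 4
```

Every point whose label ends up <= 0 would then be treated as a less-flat planar point, as described in the last rule above.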

Odometry (high frequency, coarse localization)

Motion distortion correction

A schematic of the motion distortion is shown below

Reprojecting point cloud to the end of a sweep

void TransformToStart(PointType const *const pi, PointType *const po) {
  double s;
  if (DISTORTION)
    s = (pi->intensity - int(pi->intensity)) / SCAN_PERIOD;
  else
    s = 1.0;

  Eigen::Quaterniond q_point_last = Eigen::Quaterniond::Identity().slerp(s, q_last_curr);
  Eigen::Vector3d t_point_last = s * t_last_curr;
  Eigen::Vector3d point(pi->x, pi->y, pi->z);
  Eigen::Vector3d un_point = q_point_last * point + t_point_last;

  po->x = un_point.x();
  po->y = un_point.y();
  po->z = un_point.z();
  po->intensity = pi->intensity;
}

void TransformToEnd(PointType const *const pi, PointType *const po) {
  // undistort point first
  pcl::PointXYZI un_point_tmp;
  TransformToStart(pi, &un_point_tmp);

  Eigen::Vector3d un_point(un_point_tmp.x, un_point_tmp.y, un_point_tmp.z);
  Eigen::Vector3d point_end = q_last_curr.inverse() * (un_point - t_last_curr);

  po->x = point_end.x();
  po->y = point_end.y();
  po->z = point_end.z();

  po->intensity = int(pi->intensity);
}

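Feature matching (scan-to-scan)

correspondence for corner features

The current point curr_point is matched to a line segment by finding the segment's two endpoints:

last_point_a: the nearest point found by a KD-tree search
last_point_b: the closest point found by searching in both the increasing and the decreasing scan direction, not on the same scan as last_point_a but within a threshold number of scans

correspondence for plane features

The current point curr_point is matched to a plane by finding three points on it:

last_point_a: the nearest point found by a KD-tree search
last_point_b: the closest point found by searching in the increasing (intensity <= closestPointScanID) and decreasing (intensity >= closestPointScanID) scan directions, within a threshold number of scans
last_point_c: the closest point found by searching in the increasing (intensity > closestPointScanID) and decreasing (intensity < closestPointScanID) scan directions, within a threshold number of scans

Motion estimation (ICP)

Residual metrics

point-to-line distance
point-to-plane distance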
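The two residuals can be written down directly from the matched points. The sketch below is an illustration in plain Python with hypothetical helper names; the line is defined by last_point_a and last_point_b, the plane by last_point_a, last_point_b, and last_point_c.

```python
import math


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return math.sqrt(dot(u, u))


def point_to_line_distance(p, a, b):
    """Distance from p to the line through a and b: |(p - a) x (p - b)| / |a - b|."""
    return norm(cross(sub(p, a), sub(p, b))) / norm(sub(a, b))


def point_to_plane_distance(p, a, b, c):
    """Distance from p to the plane through a, b, c, using the plane normal."""
    normal = cross(sub(b, a), sub(c, a))
    return abs(dot(sub(p, a), normal)) / norm(normal)


print(point_to_line_distance((0, 1, 0), (0, 0, 0), (1, 0, 0)))              # 1.0
print(point_to_plane_distance((0, 0, 2), (0, 0, 0), (1, 0, 0), (0, 1, 0)))  # 2.0
```

In the optimization, curr_point is first transformed by the pose being estimated, and these distances are the quantities that the solver drives toward zero.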

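Mapping (low frequency, fine localization)

Cube-based map management

LOAM manages the map as a grid of cubes: the whole map is divided into 21×21×11 cubes, each with an edge length of 50 m. As the map grows, the part outside the grid is discarded, which keeps memory usage from growing without bound while the program runs and keeps the map efficient to query.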
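As a rough illustration of the cube grid, the sketch below maps a point in the map frame to a flat cube index. The grid size and cube edge come from the text; the centre offsets (10, 10, 5) are an assumption matching the initial values in the A-LOAM source, and the function name is made up.

```python
def cube_index(x, y, z, size=50.0, center=(10, 10, 5), dims=(21, 21, 11)):
    """Flat index of the cube containing (x, y, z), or None if outside the grid."""
    idx = []
    for value, c in zip((x, y, z), center):
        shifted = value + size / 2.0
        i = int(shifted / size) + c   # int() truncates toward zero, like the C++ cast
        if shifted < 0:
            i -= 1                    # correct the truncation for negative coordinates
        idx.append(i)
    i, j, k = idx
    if not (0 <= i < dims[0] and 0 <= j < dims[1] and 0 <= k < dims[2]):
        return None
    return i + dims[0] * j + dims[0] * dims[1] * k


print(cube_index(0.0, 0.0, 0.0))      # 2425, the centre cube
print(cube_index(-30.0, 0.0, 0.0))    # 2424, one cube to the left
print(cube_index(10000.0, 0.0, 0.0))  # None, outside the 21 x 21 x 11 grid
```

The grid holds 21 × 21 × 11 = 4851 cubes in total, so each feature type needs one array of that many point clouds.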

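Feature matching (scan-to-map)

The undistorted point cloud of the current frame is transformed into the global frame with transformAssociateToMap() and then matched against the local map (also called the submap; the source code manages it with a 3D grid of cubes).

Local map used for feature matching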

int laserCloudValidNum = 0;
int laserCloudSurroundNum = 0;
// search the neighbouring cubes: 5 along I and J (2 before, the centre, 2 after) and 3 along K
for (int i = centerCubeI - 2; i <= centerCubeI + 2; i++) {
  for (int j = centerCubeJ - 2; j <= centerCubeJ + 2; j++) {
    for (int k = centerCubeK - 1; k <= centerCubeK + 1; k++) {
      if (i >= 0 && i < laserCloudWidth && j >= 0 && j < laserCloudHeight && k >= 0 && k < laserCloudDepth) {
        laserCloudValidInd[laserCloudValidNum] = i + laserCloudWidth * j + laserCloudWidth * laserCloudHeight * k;
        laserCloudValidNum++;
        laserCloudSurroundInd[laserCloudSurroundNum] = i + laserCloudWidth * j + laserCloudWidth * laserCloudHeight * k;
        laserCloudSurroundNum++;
      }
    }
  }
}

// build the feature-point map used for the correspondence search
for (int i = 0; i < laserCloudValidNum; i++) {
  *laserCloudCornerFromMap += *laserCloudCornerArray[laserCloudValidInd[i]];
  *laserCloudSurfFromMap += *laserCloudSurfArray[laserCloudValidInd[i]];
}

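correspondence for corner features

The current point curr_point is matched to a line segment by finding the segment's two endpoints:

A KD-tree search returns the 5 nearest points (the farthest of them must be closer than 1 m); their centroid center and their covariance matrix are computed. For a line feature, the eigenvector belonging to the largest eigenvalue of the covariance matrix is the line direction unit_direction, and the two endpoints are then derived from the centroid and the direction:

last_point_a
last_point_b

correspondence for plane features

The current point curr_point is matched to a plane by finding the plane's normal vector:

A KD-tree search returns the 5 nearest points (the farthest of them must be closer than 1 m), from which the plane normal is computed.

Motion estimation (ICP)

Residual metrics

point-to-line distance
point-to-plane distance

The resulting pose is used to correct the odometry pose.

Map growth

The new points are added to laserCloudCornerArray and laserCloudSurfArray, which are then downsampled. As the map grows, the part outside the grid is discarded, which keeps memory usage from growing without bound while the program runs and keeps the map efficient to query.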

for (int i = 0; i < laserCloudCornerStackNum; i++) {
  pointAssociateToMap(&laserCloudCornerStack->points[i], &pointSel);

  int cubeI = int((pointSel.x + 25.0) / 50.0) + laserCloudCenWidth;
  int cubeJ = int((pointSel.y + 25.0) / 50.0) + laserCloudCenHeight;
  int cubeK = int((pointSel.z + 25.0) / 50.0) + laserCloudCenDepth;

  if (pointSel.x + 25.0 < 0) cubeI--;
  if (pointSel.y + 25.0 < 0) cubeJ--;
  if (pointSel.z + 25.0 < 0) cubeK--;

  if (cubeI >= 0 && cubeI < laserCloudWidth && cubeJ >= 0 && cubeJ < laserCloudHeight && cubeK >= 0 &&
      cubeK < laserCloudDepth) {
    int cubeInd = cubeI + laserCloudWidth * cubeJ + laserCloudWidth * laserCloudHeight * cubeK;
    laserCloudCornerArray[cubeInd]->push_back(pointSel);
  }
}

for (int i = 0; i < laserCloudSurfStackNum; i++) {
  pointAssociateToMap(&laserCloudSurfStack->points[i], &pointSel);

  int cubeI = int((pointSel.x + 25.0) / 50.0) + laserCloudCenWidth;
  int cubeJ = int((pointSel.y + 25.0) / 50.0) + laserCloudCenHeight;
  int cubeK = int((pointSel.z + 25.0) / 50.0) + laserCloudCenDepth;

  if (pointSel.x + 25.0 < 0) cubeI--;
  if (pointSel.y + 25.0 < 0) cubeJ--;
  if (pointSel.z + 25.0 < 0) cubeK--;

  if (cubeI >= 0 && cubeI < laserCloudWidth && cubeJ >= 0 && cubeJ < laserCloudHeight && cubeK >= 0 &&
      cubeK < laserCloudDepth) {
    int cubeInd = cubeI + laserCloudWidth * cubeJ + laserCloudWidth * laserCloudHeight * cubeK;
    laserCloudSurfArray[cubeInd]->push_back(pointSel);
  }
}



OpenMVG + OpenMVS build & run
2022-06-17T16:00:00.000Z
[TOC]
Overview
OpenMVG provides an end-to-end framework for 3D reconstruction from images, composed of libraries, binaries, and pipelines.
It is a library mainly focused on Multiple-View Geometry and Structure-from-Motion.
OpenMVS (Multi-View Stereo) is a library for computer-vision scientists, targeted especially at the Multi-View Stereo reconstruction community.
While Structure-from-Motion pipelines (like OpenMVG) recover camera poses and a sparse 3D point cloud from an input set of images, none of them addresses the last part of the photogrammetry chain. OpenMVS aims at filling that gap by providing a complete set of algorithms to recover the full surface of the scene to be reconstructed.
The input is a set of camera poses plus the sparse point cloud, and the output is a textured mesh.
Modules
OpenMVG
Build
https://github.com/openMVG/openMVG/blob/develop/BUILD.md
Run
OpenMVG on your image dataset
Modify the code below in SfM_SequentialPipeline.py, SfM_GlobalPipeline.py or tutorial_demo.py
# before
pIntrisics = subprocess.Popen( [os.path.join(OPENMVG_SFM_BIN, "openMVG_main_SfMInit_ImageListing"),  "-i", input_dir, "-o", matches_dir, "-d", camera_file_params] )

# after
pIntrisics = subprocess.Popen( [os.path.join(OPENMVG_SFM_BIN, "openMVG_main_SfMInit_ImageListing"),  "-i", input_dir, "-o", matches_dir, "-d", camera_file_params, "-f", "xxx"] )
then
python SfM_GlobalPipeline.py [full path image directory] [resulting directory]
MVG (SfM scene) to MVS
openMVG_main_openMVG2openMVS -i sfm_data.bin -o scene.mvs
OpenMVS
Build
https://github.com/cdcseacave/openMVS/wiki/Building
Run
OpenMVS_sample
View
Viewer module can be used to visualize any MVS project file or PLY/OBJ file.
./openMVS_build/bin/Viewer xxx.mvs # or xxx.ply xxx.obj
Ref
OpenMVG and OpenMVS: installation, configuration, and basic usage


Mesh Texturing in a Nutshell (Let There Be Color)
2022-06-11T16:00:00.000Z
[TOC]
Overview
digraph { TV [label="TextureView"]; TVs [label="N TextureViews"]; Texturing [style=filled, shape=box]; ColorImg->TV; CamK->TV; CamTF->TV; TV->TVs; TVs->Input; TriangleMesh->Input; Input->Texturing->Output->TexturedMesh; }
code (forked): https://github.com/cggos/mvs-texturing
paper: Let There Be Color! Large-Scale Texturing of 3D Reconstructions
video: https://www.youtube.com/watch?v=Ie-qLJdmlLI
1. Texture Views
tex::generate_texture_views()
digraph { TV [label="TextureView"]; TV->ColorImg; TV->CamK; TV->CamTF; }
2. Mesh --> MeshInfo
tex::prepare_mesh()
Check Mesh
TriangleMesh::ensure_normals()
Ensure face and vertex normals
Init MeshInfo
MeshInfo::initialize()
Create VertexInfo
Add faces to their three vertices
digraph { vertex -> face1 vertex -> face2 vertex -> face3 }
Update VertexInfo
Classify each vertex and compute adjacency info
Build new, temporary adjacent faces representation AdjacentFaceList adj_temp for ordering
digraph { face_id [color=green]; front_vid [color=blue]; back_vid [color=blue];
AdjFaceTmp->face_id AdjFaceTmp->front_vid AdjFaceTmp->back_vid }
graph { layout=twopi; node [shape=circle];
v0 [color="red"]; v1 [color="blue"]; v2 [color="blue"];
v0--v1 [color=green]; v0--v2 [color=green]; v0--v3; v0--v4; v1--v2 [color=green]; v3--v2; v3--v4; v1--v4;
overlap=false; }
Sort adjacent faces by chaining them
AdjacentFaceList adj_sorted;
update VertexInfo
digraph { vclass; verts [color=blue]; faces [color=green];
vinfo->vclass; vinfo->verts; vinfo->faces; }
3. Mesh + MeshInfo --> Adjacency Graph (UniGraph)
tex::build_adjacency_graph()
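For each face, the faces in the mesh that are adjacent to each of its edges are stored in adj_faces; an edge is then created between the current face and every face in adj_faces, which builds the UniGraph.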
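A minimal sketch of this construction in Python: two faces are adjacent when they share an edge. Note that this uses a plain edge-to-faces dictionary rather than the MeshInfo vertex adjacency that tex::build_adjacency_graph() relies on, and the function name is made up for illustration.

```python
from collections import defaultdict


def build_face_adjacency(faces):
    """faces: list of (v0, v1, v2) vertex-index triples.

    Returns {face_id: set of ids of faces sharing an edge with it}.
    """
    edge_to_faces = defaultdict(list)
    for face_id, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces[(min(u, v), max(u, v))].append(face_id)

    graph = {face_id: set() for face_id in range(len(faces))}
    for face_ids in edge_to_faces.values():
        for f in face_ids:
            for g in face_ids:
                if f != g:
                    graph[f].add(g)
    return graph


# A fan of three triangles around vertex 0.
print(build_face_adjacency([(0, 1, 2), (0, 2, 3), (0, 3, 4)]))
# {0: {1}, 1: {0, 2}, 2: {1}}
```

The resulting graph has one node per face, which is the structure the view-selection step later optimizes over.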
graph { node [shape=circle];
f0--f1; f1--f2; f0--f3; f3--f4; f1--f4;
overlap=false; }
4. View Selection --> Best View Label 😄
tex::calculate_data_costs()

tex::view_selection()
Calculate DataCosts
Calculates the data costs for each face and texture view combination, if the face is visible within the texture view.
FaceProjectionInfos face_projection_infos(num_faces);
calculate_face_projection_infos(mesh, texture_views, settings, &face_projection_infos);
postprocess_face_infos(settings, &face_projection_infos, data_costs);
Calculate FaceProjectionInfo
for (std::uint16_t j = 0; j < static_cast<std::uint16_t>(num_views); ++j) {
  TextureView * texture_view = &texture_views->at(j);
  // get view_pos and view_dir
  for (std::size_t i = 0; i < faces.size(); i += 3) {
    // get face_normal and face_center
    // compute and check viewing_angle
    // get face info
  }
}
digraph { face_info [style=filled]; nnn [label="..."]; face_info_n [style=filled]; rankdir=LR; face_id->face_info; face_info->view_id; face_info->mean_color; face_info->quality; face_id->nnn; face_id->face_info_n; }
PostProcess Face Infos
create hist_qualities::Histogram using info.quality, and get the upper_bound when percentile=0.995
compute data cost
gmi
area
float normalized_quality = std::min(1.0f, info.quality / percentile);
float data_cost = (1.0f - normalized_quality);
data_costs->set_value(i, info.view_id, data_cost);
DataCost table: one row per view (view0, view1, ..., viewN) and one column per face (face0, face1, ..., faceN).
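The post-processing step can be illustrated with a short Python sketch. It assumes positive quality values and replaces the histogram-based upper bound with a plain order statistic, so it approximates the idea rather than reproducing the mvs-texturing code; the function name and the tuple layout are made up.

```python
def data_costs_from_qualities(infos, percentile=0.995):
    """infos: list of (face_id, view_id, quality) with positive qualities.

    Returns {(face_id, view_id): cost}, where cost = 1 - min(1, quality / bound).
    """
    qualities = sorted(quality for _, _, quality in infos)
    # Assumption: an order statistic stands in for the histogram upper bound.
    bound = qualities[min(len(qualities) - 1, int(percentile * len(qualities)))]
    costs = {}
    for face_id, view_id, quality in infos:
        normalized_quality = min(1.0, quality / bound)
        costs[(face_id, view_id)] = 1.0 - normalized_quality
    return costs


print(data_costs_from_qualities([(0, 0, 2.0), (0, 1, 4.0), (1, 0, 1.0)]))
# {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.75}
```

Clamping at the percentile bound keeps a few outlier qualities from compressing all other costs toward 1.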
View Selection
Data Association
Graph mapmap::Graph
graph { rankdir = LR; face_id--adj_face_id [label="weight"]; }
LabelSet mapmap::LabelSet
LabelSet table: one row per view id (view0, view1, ..., viewN) and one column per face (face0, face1, ..., faceN).
Unaries
using unary_t = mapmap::UnaryTable<cost_t, simd_w>;
std::vector<unary_t> unaries;
Unaries table: one row per unary (unary0, unary1, ..., unaryN), with the columns face_id, label_set, and costs.
Pairwise
using pairwise_t = mapmap::PairwisePotts<cost_t, simd_w>;
pairwise_t pairwise(1.0f);
MAP-MRF 🚩
mapmap::mapMAP<cost_t, simd_w> solver;
solver.set_graph(&mgraph);
solver.set_label_set(&label_set);
for(std::size_t i = 0; i < graph->num_nodes(); ++i)
    solver.set_unary(i, &unaries[i]);
solver.set_pairwise(&pairwise);
solver.set_logging_callback(display);
solver.set_termination_criterion(&terminate);
solver.optimize(solution, ctr);
The aim is to find a labeling for X that produces the lowest energy.
pairwise MRFs
the filled-in circles: the observed nodes \(Y_i\) (face)
the empty circles: the "hidden" nodes \(X_i\) (view label)
MAP --> Minimum Energy
energy/cost function:
\[\text{energy} (Y, X) = \sum_{i} \text{DataCost} (y_i, x_i) + \sum_{(i, j) \in \mathcal{N}} \text{SmoothnessCost} (x_i, x_j)\]
where \(\mathcal{N}\) is the set of pairs of neighbouring nodes.
Tree MRFs via DP
LBP
by OpenMVS
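To make the energy concrete, the sketch below evaluates it for a given labeling with a Potts smoothness term (a constant penalty whenever two neighbouring faces get different views) and finds the minimum by exhaustive search. Exhaustive search is only feasible for toy problems and is not how mapMAP or LBP work; all names are made up for illustration.

```python
from itertools import product


def labeling_energy(labels, data_cost, edges, smoothness=1.0):
    """labels[i] is the view chosen for face i; data_cost[i][label] is its data cost."""
    unary = sum(data_cost[i][label] for i, label in enumerate(labels))
    pairwise = sum(smoothness for i, j in edges if labels[i] != labels[j])
    return unary + pairwise


def brute_force_map(data_cost, edges, num_labels, smoothness=1.0):
    """Try every labeling and return the one with the lowest energy."""
    candidates = product(range(num_labels), repeat=len(data_cost))
    return min(candidates,
               key=lambda labels: labeling_energy(labels, data_cost, edges, smoothness))


# Three faces in a chain, two candidate views each.
costs = [[0.1, 0.9], [0.6, 0.5], [0.2, 0.8]]
chain = [(0, 1), (1, 2)]
print(brute_force_map(costs, chain, num_labels=2))  # (0, 0, 0)
```

In this toy case face 1 would prefer view 1 on its data cost alone, but the smoothness term makes the uniform labeling cheaper, which is the effect that reduces visible seams.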
5. Create Texture Atlases 😄
tex::generate_texture_patches()

tex::global_seam_leveling()
tex::local_seam_leveling()

tex::generate_texture_atlases()
Generate Texture Patches
Generates texture patches using the graph to determine adjacent faces with the same label.
Global / Local Seam Levelling 🚩
paper: Seamless Mosaicing of Image-Based Texture Maps
without seam levelling
Texture Atlases
Generate TextureAtlas objects from all TexturePatch objects
6. Mesh + Texture --> Obj Model
tex::build_model()

tex::Model::save()
.obj
.mtl
.png
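Mesh UV Unwrapping
The texture reconstruction described above belongs to computer vision; this section covers its inverse process, which belongs to computer graphics.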
http://geometryhub.net/notes/uvunfold
Reference
The concept and role of UV
[Let It Be Color! Texture reconstruction in 3D reconstruction] 02: Mapping-based texture reconstruction algorithms (Part 1)
https://github.com/tyluann/3DTexture
https://zhuanlan.zhihu.com/p/44424934


Kinect Fusion
2022-06-09T16:00:00.000Z
[TOC]
Overview
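KinectFusion describes 3D space with a volumetric representation. A space of fixed size (for example 3 m × 3 m × 3 m) is divided evenly into small cubes (for example 512 × 512 × 512); each cube is a voxel that stores a TSDF value and a weight. The final 3D reconstruction is obtained by linear interpolation over these voxels.
The reconstruction pipeline is shown in the figure above:
Depth Map Conversion: the input depth image is converted into a 3D point cloud, and the normal of every point is computed
Camera Tracking (map-to-frame ICP): the point cloud with normals is registered, using ICP, against the point cloud raycast from the model at the previous frame's pose, which yields the current pose
Volumetric Integration: using the estimated pose, the point cloud of the current frame is fused into the volumetric model; this is where the TSDF is used
Raycasting: given the current camera pose, the model is raycast to obtain the point cloud seen from the current view, and its normals are computed; the result is used to register the next input frame
This is a loop: by moving the camera, point clouds of the scene are acquired from different viewpoints and the complete scene surface is reconstructed.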
Project:
KinectFusion Project Page [Microsoft]
KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera
Depth Map Conversion
A three-level pyramid is built so that the camera pose can be computed from coarse to fine, which speeds up the computation.
Camera Tracking (ICP)
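The camera pose is solved with ICP (Iterative Closest Point). ICP is a standard tool for point clouds: by minimizing the difference between two point clouds, it iteratively solves for the relative pose between the cameras that captured them.
There are different ways to measure the difference between point clouds; the two most common are point-to-point and point-to-plane. KinectFusion uses point-to-plane, which projects the point-to-point distance onto the normal vector. Point-to-plane converges much faster than point-to-point and is more robust.
KinectFusion uses frame-to-model registration (the point cloud converted from the current depth image is registered against the point cloud obtained by projecting the model at the previous camera pose) rather than frame-to-frame registration (the point cloud of the current depth image is registered against the point cloud of the previous depth image). The authors also verify in the paper that frame-to-model gives a more accurate reconstruction.
Once the pose estimate is available, the current measurement can be fused into the global model.
The model here is a TSDF map.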
Volumetric Integration (TSDF)
Surface Reconstruction (Raycast TSDF)
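After the TSDF values are updated, the TSDF can be used to estimate a vertex/normal map. A map estimated this way has less noise and fewer holes than the depth map taken directly from the RGB-D camera (an RGB-D camera returns some invalid data, which shows up as black holes in the point cloud). The estimated vertex/normal map, together with the measurements of the next frame, is used to estimate the camera pose.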
Ray-Casting
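The specific surface estimation method is called raycasting. It simulates a camera at the observation position and casts a ray from every pixel according to the intrinsics 𝐾. The ray passes through one voxel after another; where it hits the surface, it must pass through two adjacent voxels whose TSDF values are one positive and one negative (because the TSDF value at the intersection of the ray and the surface is 0), and the surface lies between these two voxels. Linear interpolation over the positions and TSDF values of the two voxels then gives the precise intersection. The set of these intersections forms the surface of the 3D model.
As shown in the figure: starting from the optical center and passing through a pixel, the point where the ray crosses from positive to negative in the voxel model is the reconstructed scene surface visible at that pixel. Repeating this projection for every pixel yields the point cloud seen at each pixel.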
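The zero-crossing search along a single ray can be sketched as follows. The sketch assumes the TSDF has already been sampled at increasing depths along the ray; sampling the voxel grid itself is left out, and the function name is made up for illustration.

```python
def find_surface_crossing(tsdf_along_ray, depths):
    """Depth of the first positive-to-negative TSDF crossing along a ray, or None.

    tsdf_along_ray[k] is the TSDF value sampled at depths[k]; depths must increase.
    """
    for k in range(len(depths) - 1):
        f0, f1 = tsdf_along_ray[k], tsdf_along_ray[k + 1]
        if f0 > 0.0 >= f1:
            # Linear interpolation: the TSDF is assumed to vary linearly between samples.
            t = f0 / (f0 - f1)
            return depths[k] + t * (depths[k + 1] - depths[k])
    return None


print(find_surface_crossing([1.0, 0.5, -0.5, -1.0], [0.0, 1.0, 2.0, 3.0]))  # 1.5
```

Back-projecting the returned depth through the pixel gives one vertex of the predicted vertex map; a ray that never crosses zero produces no vertex.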
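The normals of the point cloud obtained by raycasting are then computed, and this point cloud with normals is registered against the next input frame to compute that frame's pose. This is a loop.
At this point the KinectFusion pipeline is complete and has reconstructed a surface model in point-cloud form. In practical applications, especially in AR, a 3D model in mesh form is also needed, and often texture mapping as well.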
Mesh Generation
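Marching Cubes
Marching Cubes is used to reconstruct triangle faces from the reconstructed point cloud.
Point cloud data is a discrete representation in 3D space; running the Marching Cubes algorithm on the TSDF map extracts the isosurface and so reconstructs the triangle faces.
The basic idea of Marching Cubes is to process the voxels of the scalar field one by one, pick out the voxels that intersect the isosurface, and compute by interpolation the intersections of the isosurface with the cube edges. Depending on the position of each cube vertex relative to the isosurface, the intersections on the cube edges are connected in a fixed pattern to form a surface that approximates the isosurface within that cube.
Marching Cubes extracts the 3D mesh stored implicitly in the TSDF voxels, which in practice means extracting the zero isosurface of the TSDF. First, a traversal step walks the TSDF grid and locates and records the voxels that intersect the zero isosurface. Then, an extraction step applies a pre-stored mesh index table and linear interpolation to the recorded voxels to extract the triangle mesh, giving the reconstructed 3D geometric model.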
Texturing
The KinectFusion steps above make up the geometric reconstruction. In practice, especially in AR, the 3D model also needs texture reconstruction (texture mapping).
Other OS Code
Volumetric TSDF Fusion of RGB-D Images in Python
KFusion 0.4
Kintinuous: Spatially Extended KinectFusion
KinFu: https://github.com/Nerei/kinfu_remake
OpenCV KinFu: https://github.com/opencv/opencv_contrib/blob/master/modules/rgbd/src/kinfu.cpp
PCL KinFu: https://github.com/PointCloudLibrary/pcl/tree/master/gpu/kinfu
https://github.com/sjy234sjy234/KinectFusion-ios


CMake for Visual Studio
2022-05-31T16:00:00.000Z
[TOC]
Property Sheet
Example main.props: the project settings are configured in this file, which makes them easy to share
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ImportGroup Label="PropertySheets">
    <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
  </ImportGroup>

  <PropertyGroup Label="UserMacros" />

  <PropertyGroup>
    <IncludePath>

    </IncludePath>
    <LibraryPath>

      $(LibraryPath)
    </LibraryPath>
  </PropertyGroup>

  <ItemDefinitionGroup>
    <Link>
      <AdditionalDependencies>

        %(AdditionalDependencies)
      </AdditionalDependencies>
      <AdditionalLibraryDirectories>

      </AdditionalLibraryDirectories>
    </Link>
  </ItemDefinitionGroup>

  <ItemGroup />
</Project>
CMakeLists.txt
example file
cmake_minimum_required(VERSION 3.0.0)

project(ProjName C CXX)

set(CMAKE_CONFIGURATION_TYPES "Release" CACHE STRING "Possible configurations" FORCE)
if (DEFINED CMAKE_BUILD_TYPE)
  set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS ${CMAKE_CONFIGURATION_TYPES})
endif()

if(MSVC)
  # Dynamically link the runtime libraries
  add_compile_options(
    $<$<CONFIG:>:/MD>
    $<$<CONFIG:Debug>:/MDd>
    $<$<CONFIG:Release>:/MD>
  )

  # Make Visual Studio's Linker -> Input -> Additional Dependencies "inherit from parent"
  set(CMAKE_CXX_STANDARD_LIBRARIES "${CMAKE_CXX_STANDARD_LIBRARIES} %(AdditionalDependencies)")
endif()

set(CMAKE_INSTALL_PREFIX ${PROJECT_BINARY_DIR}/install)
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY_RELEASE  ${PROJECT_BINARY_DIR}/lib/)
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY_RELEASE ${PROJECT_BINARY_DIR}/bin/)

add_definitions(-w)
if(MSVC)
#remove_definitions(/W3)
#add_definitions(/W4)
add_definitions(/wd4819)
add_definitions(/wd4244)
add_definitions(/wd4267)
add_definitions(-DNOMINMAX)
add_definitions(-D_CRT_SECURE_NO_WARNINGS)
endif()

include_directories()

# Source groups (filters) in Visual Studio
SOURCE_GROUP("" FILES ${Root_FILES})
SOURCE_GROUP(Dir\\SubDir FILES ${SubDir_FILES})

add_executable(foo)
target_link_libraries(foo ${LIBS})

# set_target_properties(foo PROPERTIES COMPILE_FLAGS "/Od")

# Add a custom property sheet file to the Visual Studio project
set_target_properties(foo PROPERTIES VS_USER_PROPS "${PROJECT_SOURCE_DIR}/main.props")
# set_target_properties(foo PROPERTIES LINK_LIBRARIES "%(AdditionalDependencies)")

# Visual Studio post-build event
ADD_CUSTOM_COMMAND(
  TARGET foo
  POST_BUILD
  COMMAND ${CMAKE_COMMAND} -E make_directory ${PROJECT_BINARY_DIR}
  COMMAND ${CMAKE_COMMAND} -E copy ${EXAMPLE_BIN_NAME} ${PROJECT_BINARY_DIR}/.
)
Building System for Visual Studio
example Windows Batch file config.bat
:: Create configuration for Visual Studio for Release and Debug modes.

@echo off

set ARCH=x64
set SRC_DIR=.\
set BUILD_DIR=%SRC_DIR%/build%ARCH%

echo.
echo PROCESSOR_ARCHITECTURE: %PROCESSOR_ARCHITECTURE%

echo.
echo ============= Checking for CMake ============
echo.

cmake --version
if not %errorlevel% == 0 (
    echo Error: CMake not found, please install it - see http://www.cmake.org/
    exit /B 1
)

echo.
echo ============= Creating build system for Visual Studio ============
echo.

if exist %BUILD_DIR% (
echo delete
rmdir /s %BUILD_DIR%
)

set MSVCDIR=%VS140COMNTOOLS%..\..\VC

:: Configure environment for Visual Studio
call "%MSVCDIR%\VCVARSALL.BAT" %ARCH%

set CMAKE_VS_GENERATOR=Visual Studio 14 2015 Win64
echo Using cmake generator %CMAKE_VS_GENERATOR%
set cmake_generator_options=-G "%CMAKE_VS_GENERATOR%"

if "%CMAKE_VS_GENERATOR_TOOLSET%" neq "" (
    echo Using cmake generator toolset %CMAKE_VS_GENERATOR_TOOLSET%
    set cmake_generator_options=%cmake_generator_options% -T "%CMAKE_VS_GENERATOR_TOOLSET%"
)

:: copy CMakeLists.txt ..

::set cmake_debug_options=--trace --debug-output
cmake -S %SRC_DIR% -B %BUILD_DIR% -Wno-dev %cmake_debug_options% %cmake_generator_options% || exit /B 1

echo.
echo ============= Building ============
echo.
echo To build:
echo - go to %BUILD_DIR%
echo - run 'cmake --build %BUILD_DIR% --parallel 3 --config Release(or Debug) [--target target_to_build]'
echo.

exit /B 0
Build
with cmake
from cmd with cmake --build
cmake --build %build-dir% --parallel 3 --config Release
with Visual Studio
Open the Visual Studio solution xxx.sln and build the solution
Ref
CMake projects in Visual Studio
Sharing project properties in Visual C++
使用cmake构建visual studio项目


Numerical Approximate Solutions of Equations
2022-04-30T16:00:00.000Z
Solve \(\sqrt{x}\) ^[1]^[2]
Babylonian method / Newton's method ^[2]^[3]
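// C++ code: Babylonian (Newton) iteration for sqrt(x), assuming x >= 0
#include <cmath>

double babylonian_sqrt(double x) {
    double ans = 1, pre = 0;
    // Iterate until two successive estimates agree to within 1e-6
    while (std::fabs(ans - pre) > 1e-6) {
        pre = ans;
        ans = (ans + x / ans) / 2;  // average of ans and x / ans
    }
    return ans;
}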
Series approximation based on Taylor's formula ^[2]
Taylor expansion about the linearization point \(x_0=1\)
\[\sqrt{x} \cong 1+\frac{1}{2}(x-1)-\frac{1}{4} \frac{(x-1)^{2}}{2 !}+\frac{3}{8} \frac{(x-1)^{3}}{3 !}-\frac{15}{16} \frac{(x-1)^{4}}{4 !}+\cdots\]
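With this formula we can approximate the true value to a given precision, but one issue remains: the convergence of the series.
The Taylor series of the square-root function is valid only when the argument lies inside a certain range, within which the partial sums converge. That range is set by the radius of convergence: when expanding about \(x_0=1\), the series requires \(0 < x < 2\). If x lies outside it, the terms of the expansion keep growing and the partial sums drift further and further from the answer. To work around this, divide the radicand by 4 repeatedly until it falls inside the range, then multiply the result by 2 the same number of times.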
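The range reduction and the series can be combined in a few lines. The following is a minimal sketch, not a tuned implementation: the function name `taylor_sqrt`, the number of terms, and the reduction interval [0.4, 1.6) are assumptions chosen so that \(|x-1| \le 0.6\) and the series converges quickly.

```python
def taylor_sqrt(x, terms=60):
    """Approximate sqrt(x) with the Taylor (binomial) series about x0 = 1."""
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0

    # Range reduction: sqrt(x) = 2 * sqrt(x / 4) = 0.5 * sqrt(4 * x).
    scale = 1.0
    while x >= 1.6:
        x /= 4.0
        scale *= 2.0
    while x < 0.4:
        x *= 4.0
        scale /= 2.0

    # Sum_{n>=0} C(1/2, n) * (x - 1)^n, built with the recurrence
    # C(a, n) = C(a, n - 1) * (a - n + 1) / n.
    u = x - 1.0
    term = 1.0
    total = 1.0
    for n in range(1, terms):
        term *= (0.5 - (n - 1)) / n * u
        total += term
    return scale * total
```

The first generated terms are \(1 + \frac{1}{2}u - \frac{1}{8}u^2 + \frac{1}{16}u^3\) with \(u = x - 1\), which match the expansion given above.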
Code with Online Compiler
Reference
[1] Methods of computing square roots
[2] 常用的平方根算法详解与实现
[3] 平方根-泰勒展开式求法


DBoW Note
2022-03-24T16:00:00.000Z
Overview
Bag of Words
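BoW (Bag of Words) is a concept widely used in natural language processing. An article may contain ten thousand words but only 500 distinct ones, each occurring a different number of times. Picture one bag per distinct word, with every occurrence of that word dropped into its bag. The resulting counts form a representation of the text that ignores grammar and word order.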
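As a small illustration of the text case, the sketch below counts words with the standard library. Lowercasing and whitespace tokenization are simplifying assumptions, and `bag_of_words` is a hypothetical helper name.

```python
from collections import Counter


def bag_of_words(text):
    """Count how often each distinct word occurs, ignoring order."""
    return Counter(text.lower().split())


bow = bag_of_words("the cat sat on the mat the end")
shuffled = bag_of_words("end the mat the on sat cat the")
```

Both `bow` and `shuffled` hold the same counts, which shows that the representation discards word order.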
DBoW
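In computer vision, an image is usually represented by its feature points and their descriptors. If each descriptor is treated as a word, a bag-of-words model can be built for images, which is what the DBoW2 library described here does. With DBoW2 an image is converted into a compact, low-dimensional vector, so comparing two images reduces to comparing two vectors. In essence this is information compression.
The DBoW algorithm comes from Juan D. Tardos's group in Spain and targets the place recognition problem. The loop-closure modules of SLAM systems such as ORB-SLAM and VINS-Mono use it. It is based on the bag-of-words (BoW) model: https://en.wikipedia.org/wiki/Bag-of-words_model_in_computer_vision.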
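Key terms:
Feature vector (FeatVec) - a single visual feature descriptor
Visual word - a cluster center in the vocabulary, i.e. a single visual feature descriptor with a weight
Bag-of-words vector (BowVec) - a vector for one image built from whether each vocabulary word occurs (plus its occurrence count and TF-IDF weight), summarizing many visual feature descriptors
Main steps:
Build the vocabulary: convert the image database into an index structure (a k-ary tree)
Approximate nearest neighbor (ANN) search: convert the descriptors of an image's features into visual words by searching the k-ary tree; the visual words together form the bag-of-words vector (BoW vector)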
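Offline step - building the visual vocabulary (a clustering problem, also called unsupervised classification)
K-means is the main tool: the visual features of the training image database (DBoW3 supports two binary descriptors, ORB and BRIEF) are grouped into k clusters, each described by its centroid. Clustering quality is usually measured by the within-cluster sum of squared errors (SSE); a smaller SSE means the points of a cluster lie closer to its centroid and the clustering is better. "Closer" is defined by a distance metric, and the choice of metric also affects the result (discussed later).
All N descriptors of the training database are distributed over the leaves of a k-ary tree with branching factor k and depth d; for example, with branching factor 3 and depth \(L_w\) the tree has \(3^{L_w}\) leaves. k and d can be tuned to the scene size and the desired performance. A query descriptor then reaches its cluster center in logarithmic time (d levels, \(d = \log_k N\)) instead of an O(N) brute-force search. Reportedly, finding the match for a query image in a database of 10000 training images takes less than 39 ms, with high recall and few false positives.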
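The arithmetic behind the logarithmic lookup can be checked directly. In this sketch the helper name `vocabulary_tree_stats` and the example values k = 10, d = 6 are illustrative assumptions; the count assumes that a query is compared against all k children at each of the d levels.

```python
def vocabulary_tree_stats(k, depth):
    """Return (number of leaves, descriptor comparisons per query)."""
    leaves = k ** depth        # one visual word per leaf
    comparisons = k * depth    # k children examined at each level
    return leaves, comparisons


leaves, comparisons = vocabulary_tree_stats(k=10, depth=6)
```

With these values a query touches 60 cluster centers instead of comparing against all one million words.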
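To further improve retrieval efficiency, success rate, and accuracy, the following indexing and weighting schemes are also used:
Inverted index (Inverse Index)
Direct index (Direct Index)
TF-IDF (Term Frequency - Inverse Document Frequency)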
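A minimal sketch of TF-IDF weighting over word ids follows. It assumes the common definitions tf = (occurrences of the word in the image) / (words in the image) and idf = log(N / N_i), where N is the number of training images and N_i the number containing word i; the function names are hypothetical and are not the DBoW2 API.

```python
import math
from collections import Counter


def idf_weights(database):
    """database: list of images, each given as a list of word ids."""
    n_images = len(database)
    images_with_word = Counter()
    for words in database:
        images_with_word.update(set(words))
    return {w: math.log(n_images / n) for w, n in images_with_word.items()}


def tf_idf_vector(words, idf):
    """Weight each word of one image by term frequency times idf."""
    counts = Counter(words)
    total = len(words)
    return {w: (n / total) * idf.get(w, 0.0) for w, n in counts.items()}


database = [[0, 1, 1], [0, 2], [0, 1]]
idf = idf_weights(database)
vec = tf_idf_vector(database[0], idf)
```

Word 0 occurs in every image, so its idf is log(1) = 0 and it contributes nothing to the vector, which is the intended effect of down-weighting uninformative words.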
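K-means is easy to implement, but it converges slowly on large data sets and may converge to a local minimum, leaving clusters that are not representative. Applying plain K-means to large-scale clustering in a high-dimensional descriptor space is therefore problematic, so its variants hierarchical K-means or K-means++ are used instead.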
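Implementation
Extract features from the training images
Cluster the extracted features with k-means++ (using Hamming distance), partitioning the descriptor space into k classes
Cluster each resulting subspace again with k-means++
Repeat the above process \(L_w\) times to organize the descriptors into a tree structure, as illustrated in the original figure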
The k-means++ code is in the following function:
template<class TDescriptor, class F>
void TemplatedVocabulary<TDescriptor, F>::HKmeansStep(NodeId parent_id, const std::vector<pDescriptor> &descriptors, int current_level) {}
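Online step - approximate nearest neighbor retrieval (ANN Retrieval)
Because ORB and BRIEF descriptors are binary, the distance metric is the Hamming distance (a bitwise XOR followed by a bit count). Each descriptor of the query image descends the vocabulary tree to its nearest leaf, which gives its visual word; the words form a bag-of-words vector (BoW vector). Similarity scores between BoW vectors then yield a ranked list of candidate matching images. Finally, geometric verification or a similar check is needed to pick the correct image (strictly speaking, the most probable one).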
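The descent through the tree can be sketched as follows. Descriptors are modelled as Python integers used as bit strings, and the `Node` class and `find_word` function are hypothetical stand-ins for illustration, not the DBoW2 data structures.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


class Node:
    def __init__(self, descriptor, children=(), word_id=None):
        self.descriptor = descriptor  # cluster center of this node
        self.children = list(children)
        self.word_id = word_id        # set only on leaves


def find_word(root, descriptor):
    """Greedy descent: at each level follow the closest child."""
    node = root
    while node.children:
        node = min(node.children, key=lambda c: hamming(c.descriptor, descriptor))
    return node.word_id


# A toy vocabulary with k = 2 and depth 2 over 8-bit descriptors.
root = Node(0, [
    Node(0b00000000, [Node(0b00000001, word_id=0), Node(0b00000110, word_id=1)]),
    Node(0b11111111, [Node(0b11110000, word_id=2), Node(0b11111110, word_id=3)]),
])
```

Because the descent is greedy, the leaf it reaches is an approximate nearest neighbor: the closest word overall may sit under a branch that was rejected at a higher level.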
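DBoW2 in ORB-SLAM2
In ORB-SLAM, after the map points have been initialized by matching all feature points between frames, subsequent frame-to-frame matching uses the feature vector instead of comparing the descriptors of all feature points pairwise. The benefit is speed, but the information is compressed and abstracted once more, which inevitably costs some performance. According to the authors' earlier paper [1] and the description on GitHub, BoW extraction for one image takes under 5 ms, and searching a database of 19000 images takes under 10 ms with zero false positives. I have not verified these experiments myself; in any case, ORB-SLAM shows that the approach works, reaching satisfactory accuracy at least on datasets and in slower-moving applications.
In the bag-of-words vocabulary the authors use, the vocabulary is a pre-trained classification tree, and there are two kinds of BoW features:
BowVector: the values (ids) and weights of the leaves of the classification tree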
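class BowVector : public std::map<WordId, WordValue> {};
FeatureVector: the ids of nodes in the classification tree, each with the indices, in the input ORB feature list, of the features that fall under it
class FeatureVector : public std::map<NodeId, std::vector<unsigned int> > {};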
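FeatureVector exists to speed up the SearchByBoW operation in ORB-SLAM. As the original figure shows, for a 6-level vocabulary with levelsup set to 4 in ORB-SLAM, the node ids at level 2 become the keys of the FeatureVector, and each key maps to the indices, within the frame, of the image features under that node. When matching features between two frames, brute-force matching is then applied only to descriptors that share a key rather than to all features of both frames, which speeds up the algorithm.

The processing is simple: given the set of feature points of an image, classify each feature with the vocabulary. Extract features from the second image and classify all of them with the vocabulary as well. Then compare the feature classes instead of the original descriptors, i.e. compare a single number instead of a vector, which is obviously much faster.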
Direct index, used to speed up matching
It is used mainly in the following functions:
// Used in TrackReferenceKeyFrame, Relocalisation and Loop Closing
int SearchByBoW(KeyFrame *pKF, Frame &F, std::vector<MapPoint*> &vpMapPointMatches);
int SearchByBoW(KeyFrame *pKF1, KeyFrame* pKF2, std::vector<MapPoint*> &vpMatches12);

// Used in LocalMapping
int SearchForTriangulation(KeyFrame *pKF1, KeyFrame* pKF2, 
 cv::Mat F12, std::vector<std::pair<size_t, size_t> > &vMatchedPairs, const bool bOnlyStereo);
The core code is as follows:
DBoW2::BowVector bow_vec0, bow_vec1;
DBoW2::FeatureVector feat_vec0, feat_vec1;

// Feature vector associate features with nodes in the 4th level (from leaves up)
// We assume the vocabulary tree has 6 levels, change the 4 otherwise
voc_ptr->transform(to_descriptor_vector(descriptors0), bow_vec0, feat_vec0, 4);
voc_ptr->transform(to_descriptor_vector(descriptors1), bow_vec1, feat_vec1, 4);

DBoW2::FeatureVector::const_iterator f0it = feat_vec0.begin();
DBoW2::FeatureVector::const_iterator f1it = feat_vec1.begin();
DBoW2::FeatureVector::const_iterator f0end = feat_vec0.end();
DBoW2::FeatureVector::const_iterator f1end = feat_vec1.end();

while (f0it != f0end && f1it != f1end) {
  if (f0it->first == f1it->first) {
    for (size_t i0 = 0, iend0 = f0it->second.size(); i0 < iend0; i0++) {
      size_t idx0 = f0it->second[i0];
      const cv::Mat &d0 = descriptors0.row(idx0);

      int bestDist = TH_LOW;
      int bestIdx1 = -1;

      for (size_t i1 = 0, iend1 = f1it->second.size(); i1 < iend1; i1++) {
        size_t idx1 = f1it->second[i1];
        const cv::Mat &d1 = descriptors1.row(idx1);

        const int dist = descriptor_distance(d0, d1);

        if (dist > TH_LOW || dist > bestDist) continue;

        if (dist < bestDist) {
          bestDist = dist;
          bestIdx1 = idx1;
        }
      }
// ...
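This way of speeding up feature matching is used heavily in ORB-SLAM2
If the direct index uses level 0 (the root node), the time complexity equals that of brute-force search
If it uses the leaf level, the search scope may be too small and correct feature matches may be missed
The authors usually choose the second or third level as the parent node (L=6); the complexity with the direct index is about O(N^2/Km)
Inverted index, used for loop closure and relocalization
The authors use an inverted index to record, for each leaf node, the ids of the images in which it occurs. When recognizing an image, the inverted index selects the candidate images that share leaf nodes with it, and scores are computed only for those rather than for every image.
Using the bag-of-words model, relocalization finds all keyframes that share words with the current frame; the code in KeyFrameDatabase::DetectLoopCandidates() is: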
// Search all keyframes that share a word with current keyframes
// Discard keyframes connected to the query keyframe
{
    unique_lock<mutex> lock(mMutex);

    for (DBoW2::BowVector::const_iterator vit = pKF->mBowVec.begin(), vend = pKF->mBowVec.end(); vit != vend; vit++) {
        list<KeyFrame*>& lKFs = mvInvertedFile[vit->first];

        for (list<KeyFrame*>::iterator lit = lKFs.begin(), lend = lKFs.end(); lit != lend; lit++) {
            KeyFrame* pKFi = *lit;
            if (pKFi->mnLoopQuery != pKF->mnId) {
                pKFi->mnLoopWords = 0;
                if (!spConnectedKeyFrames.count(pKFi)) {
                    pKFi->mnLoopQuery = pKF->mnId;
                    lKFsSharingWords.push_back(pKFi);
                }
            }
            pKFi->mnLoopWords++;
        }
    }
}
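The same idea can be written compactly without the ORB-SLAM2 types. The sketch below is an illustration only: `InvertedIndex` and its methods are hypothetical names, images are plain integer ids, and a BoW vector is any iterable of word ids.

```python
from collections import defaultdict


class InvertedIndex:
    """Maps each word id to the ids of the images that contain it."""

    def __init__(self):
        self._images_by_word = defaultdict(list)

    def add(self, image_id, bow):
        for word_id in bow:
            self._images_by_word[word_id].append(image_id)

    def candidates(self, bow, exclude=()):
        """Return {image_id: number of words shared with the query}."""
        shared = {}
        for word_id in bow:
            for image_id in self._images_by_word.get(word_id, ()):
                if image_id in exclude:
                    continue  # e.g. keyframes connected to the query
                shared[image_id] = shared.get(image_id, 0) + 1
        return shared


index = InvertedIndex()
index.add(0, {1, 2})
index.add(1, {2, 3})
index.add(2, {7})
hits = index.candidates({2, 3})
```

Image 2 shares no word with the query, so it never appears among the candidates and no score has to be computed for it.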
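Improvements
Learn the vocabulary with online-learning methods
Convert feature descriptors to a binary representation to speed up computation
Train your own features
Applications
Loop-closure detection in SLAM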
Ref
DBoW2库介绍
Introduction to BoW algorithm and DBoW2 library
Simple bag-of-words loop closure for visual SLAM
ORBSLAM2学习（四）：DBoW2源码分析（OrbVocabulary部分）
ORB-SLAM中BOW特征匹配
DBoW和KeyFrameDatabase使用记录


B-Spline for SLAM
2022-03-20T16:00:00.000Z
Overview
to convert a set of trajectory points into a continuous-time uniform cubic cumulative b-spline
BSpline.ipynb on Google Colaboratory
Uniform Cubic B-Splines in \(\mathrm{SE}(3)\)
Continuous and Discrete Time
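Discrete time: IMU preintegration through the IMU kinematic model
Align with the camera timestamps
Joint optimization
Continuous time: fit the camera poses with a B-spline to obtain a continuous pose
Pose differentiation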
Applications
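Path planning and trajectory optimization
https://zhuanlan.zhihu.com/p/159192419
Continuous-time trajectory estimation
https://jishuin.proginn.com/p/763bfbd6f25e
💡 Continuous-time trajectory representation: 1) Discrete-time trajectory optimization runs pose-graph optimization directly on the robot's discrete trajectory points. 2) Continuous-time trajectory optimization does not optimize the trajectory points directly; it fits the robot trajectory with a B-spline and adjusts the positions of the B-spline control points so that the trajectory agrees with the observations as closely as possible.
The paper uses two sets of B-spline control points to fit the robot's rotation and translation over time, respectively.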
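For the translation part, one segment of a uniform cubic B-spline in cumulative form can be evaluated as sketched below. The function names are hypothetical, control points are plain tuples, and the rotation part (which would replace the differences with logarithm and exponential maps on the rotation group) is deliberately left out.

```python
def cumulative_basis(u):
    """Cumulative basis functions of a uniform cubic B-spline, 0 <= u < 1."""
    return (
        (5.0 + 3.0 * u - 3.0 * u ** 2 + u ** 3) / 6.0,
        (1.0 + 3.0 * u + 3.0 * u ** 2 - 2.0 * u ** 3) / 6.0,
        u ** 3 / 6.0,
    )


def eval_segment(p0, p1, p2, p3, u):
    """p(u) = p0 + sum_j B_j(u) * (p_j - p_{j-1}) for four control points."""
    b1, b2, b3 = cumulative_basis(u)
    return tuple(
        a + b1 * (b - a) + b2 * (c - b) + b3 * (d - c)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


mid = eval_segment((0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0), 0.5)
```

Evenly spaced collinear control points reproduce a straight line, and consecutive segments join continuously because the end of one segment and the start of the next both equal \((p_1 + 4 p_2 + p_3)/6\).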
IMU interpolation and data simulation
Spline interpolation of KITTI IMU data
IMU data simulation
https://docs.openvins.com/classov__core_1_1BsplineSE3.html
https://matheecs.tech/study/2019/06/23/BSpline.html
Derivation: generating IMU data from an existing trajectory
Unsynchronized multi-sensor intrinsic and extrinsic least-squares calibration
IMU-camera extrinsic calibration
ref:
https://zhuanlan.zhihu.com/p/68863677
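Camera-IMU extrinsic calibration has roughly three steps:
Coarsely estimate the time delay between camera and IMU. The camera spline gives the camera's angular velocity at any time, while the gyroscope measures the IMU's angular velocity; the correlation between the two curves yields a coarse estimate of the IMU-camera time delay.
Obtain the initial rotation between IMU and camera, along with other necessary initial values: gravity and gyroscope bias.
Run a large joint optimization over all corner reprojection errors, the IMU accelerometer and gyroscope measurement errors, and the bias random-walk noise.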
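The first step, the coarse time-delay estimate, can be illustrated with a plain cross-correlation of two angular-rate magnitude signals. This sketch assumes both signals are already sampled at the same rate and uses an unnormalized correlation score; `estimate_delay` is a hypothetical helper, not a function of any calibration toolbox.

```python
def estimate_delay(camera_rate, gyro_rate, max_shift):
    """Return the shift (in samples) of gyro_rate relative to camera_rate
    that maximizes their cross-correlation."""
    best_shift, best_score = 0, float("-inf")
    for shift in range(-max_shift, max_shift + 1):
        score = 0.0
        for i, c in enumerate(camera_rate):
            j = i + shift
            if 0 <= j < len(gyro_rate):
                score += c * gyro_rate[j]
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


camera_rate = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
gyro_rate = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]  # same pulse, two samples later
delay_samples = estimate_delay(camera_rate, gyro_rate, max_shift=4)
```

Multiplying the returned shift by the sample period gives the delay in seconds, which then serves only as an initial value for the joint optimization.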


TSDF Overview
2022-03-12T16:00:00.000Z
Overview
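An SDF (Signed Distance Function) gives the distance from a point to a surface: zero on the surface, positive on one side, and negative on the other.
A TSDF (Truncated SDF) keeps SDF values only within a neighborhood of the surface. If the neighborhood extends to max truncation, the actual distance is divided by max truncation to normalize it, so TSDF values lie between -1 and +1.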
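Algorithm
The TSDF model divides the 3D space to be reconstructed into a grid and stores a value in each cell; the magnitude of the value represents the cell's distance to the reconstructed surface.
The original figure shows a reconstructed human face (the cells whose value is 0; the red line marks the reconstructed surface; the sketch is drawn in 2D although the data are 3D). Values are positive on the camera side of the reconstructed surface and negative on the other side, the absolute value grows with distance from the surface, and the zero crossing from positive to negative marks the reconstructed scene surface.
All voxels of the space are loaded onto the GPU and each thread handles one (x, y) position: for voxel coordinates (x, y, z), each GPU thread scans the column of voxels at one (x, y) coordinate.
1. For the voxel g at each (x, y) coordinate, scan from front to back in parallel
2. Convert the voxel coordinate g to the corresponding world-frame point vg
3. Apply the inverse of the camera transform Ti of the current TSDF update to obtain the camera-frame coordinate v
4. Project the camera-frame point v to the image point p (from 3D to 2D)
5. If v lies within this camera's field of view, use it to update the current tsdf representation
6. sdfi is the difference between the distance from the point vg to the current camera origin ti and the currently observed depth Di(p)
7. Steps 8-11 perform the truncation, which is what "truncated" refers to; max truncation denotes the chosen truncation range and determines how fine the final reconstruction is
8. If the difference is positive, the voxel lies behind the currently measured surface
9. tsdfi is assigned a value in [0, 1]; the closer to the observed surface, the closer the value is to 0
10. If the difference is negative, the voxel lies in front of the currently measured surface
11. tsdfi is assigned a value in [-1, 0]; the closer to the observed surface, the closer the value is to 0
12. Choose the weight wi for the tsdf value just computed. This choice directly affects adaptability to the images and robustness to noise, somewhat like a Kalman filter. The weight is incremented by 1 each time because only points within the camera's view enter the tsdf computation, so adding 1 to the previous weight accommodates surfaces that change quickly or are rarely scanned.
13. Compute tsdfavg by weighted averaging
14. Store wi and tsdfavg in the voxel and move on to the next voxel
After this scan, the tsdf values stored in the voxel grid are negative outside the reconstructed object, positive inside it, and zero on its surface (an exact zero may not occur, but the zero crossing can be interpolated from the positive and negative values, so the final surface resolution exceeds the voxel resolution).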
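Steps 6 to 14 for a single voxel reduce to a few lines. The sketch below follows the sign convention of step 6 (distance minus observed depth) and assumes that values beyond the truncation range are clamped to the ends of [-1, 1]; `update_voxel` and its parameter names are hypothetical.

```python
def update_voxel(tsdf_avg, weight, voxel_distance, observed_depth, max_truncation):
    """Fuse one depth observation into a voxel; returns (tsdf_avg, weight)."""
    sdf = voxel_distance - observed_depth              # step 6
    tsdf = max(-1.0, min(1.0, sdf / max_truncation))   # steps 7-11
    new_weight = weight + 1                            # step 12
    new_avg = (tsdf_avg * weight + tsdf) / new_weight  # step 13
    return new_avg, new_weight                         # step 14


# A voxel seen twice: once 5 cm behind the surface, once 5 cm in front.
t, w = update_voxel(0.0, 0, 1.05, 1.0, max_truncation=0.1)
t2, w2 = update_voxel(t, w, 0.95, 1.0, max_truncation=0.1)
```

After the two observations the running average is close to zero, so this voxel would lie on the interpolated surface.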
Example: MobileFusion
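Build a cuboid bounding box and divide it into a grid
Build a cuboid bounding box that contains all 3D points.
Assuming the z direction is perpendicular to the camera image plane, the extremes in x and y are the image borders, i.e. the four corners (0,0), (w,0), (0,h), (w,h), and the depth range in z is 0 to max_depth. Combining them gives 2^3 = 8 boundary points such as (0,0,0), (0,0,max_depth), (w,0,0). Transforming these points to the world frame with the camera intrinsics and extrinsics gives the extreme points of the cuboid.
Next, divide the cuboid into a grid. Suppose its extreme points are (-1,-1,-1) and (1,1,1), in meters. Dividing it into a grid means cutting out small cubes of equal volume, the voxels. With a voxel edge of 0.02 m (2 cm), the span from -1 to 1 holds 100 voxels along each axis. The 8 vertices of every small cube can be indexed by (x, y, z), where x, y, z are between 0 and 100, and their world coordinates are (-1+0.02x, -1+0.02y, -1+0.02z).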
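The grid arithmetic from the example can be written out as follows; the helper names are hypothetical and the bounds and voxel size are the ones used in the text.

```python
def grid_shape(vol_min, vol_max, voxel_size):
    """Number of voxels along each axis of the bounding cuboid."""
    return tuple(round((hi - lo) / voxel_size) for lo, hi in zip(vol_min, vol_max))


def vertex_to_world(index, vol_min, voxel_size):
    """World coordinates of the grid vertex with integer index (x, y, z)."""
    return tuple(lo + voxel_size * i for lo, i in zip(vol_min, index))


shape = grid_shape((-1.0, -1.0, -1.0), (1.0, 1.0, 1.0), 0.02)
center = vertex_to_world((50, 50, 50), (-1.0, -1.0, -1.0), 0.02)
```

Index (50, 50, 50) lands at the world origin, halfway between the two extreme points of the cuboid.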
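Iteratively update the tsdf grid
For each data frame (RGB image, depth image, pose.txt), transform the world coordinates of all grid points in the cuboid to camera coordinates with the inverse pose, then project them onto the image.
Compare the depth at the corresponding image position with the grid point's depth in the camera frame: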
\[\text{depth-diff} = \text{depth-val} - \text{cam-pts}[2,:]\]
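If \(\|\text{depth-diff}\| < \text{trunc-margin}\), the observation is considered valid.
Use \(\text{dist} = \text{depth-diff} / \text{trunc-margin}\) for the weighted update of the tsdf grid.
Each vertex of the tsdf grid stores the weighted sum of dist.
Find the isosurface
Run the marching cubes algorithm on the tsdf grid to find the isosurface where the weighted sum of dist is 0; that is the object surface.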
Related Projects
https://github.com/andyzeng/tsdf-fusion
https://github.com/andyzeng/tsdf-fusion-python 🚩
http://www.open3d.org/docs/0.12.0/tutorial/pipelines/rgbd_integration.html#TSDF-volume-integration


EKF v.s. MSCKF v.s. MAP (VINS-Mono)
2022-03-07T16:00:00.000Z
Overview
\[z = f(x) + n, \quad n \sim \mathcal{N}(0, \Sigma), \quad z \sim \mathcal{N}(f(x), \Sigma)\]
\[P(z \mid x) = \mathcal{N}(z; f(x), \Sigma) = \eta \exp \left(-\frac{1}{2}(z-f(x))^{T} {\Sigma}^{-1}(z-f(x))\right)\]
EKF
true state
\[\begin{aligned}x_k &= f(x_{k-1}, w_{k-1}) \\z_k &= h(x_k, v_k)\end{aligned}\]
nominal state
\[\begin{aligned}\bar{x}_k &= f(\hat{x}_{k-1}, 0) \\\bar{z}_k &= h(\bar{x}_k, 0)\end{aligned}\]
true state linearization
\[\begin{aligned}x_k &\approx \bar{x}_k + A(x_{k-1} - \hat{x}_{k-1}) + W w_{k-1} \\z_k &\approx \bar{z}_k + H(x_k - \bar{x}_k) + V v_k\end{aligned}\]
prediction error & measurement residual
\[\begin{aligned}e_x &\equiv x_k - \bar{x}_k \approx A(x_{k-1} - \hat{x}_{k-1}) + W w_{k-1} \\e_z &\equiv z_k - \bar{z}_k \approx H e_x + V v_k\end{aligned}\]
jacobian
\[A = \frac{\partial f}{\partial x}, \quad W = \frac{\partial f}{\partial w}\]
\[H = \frac{\partial h}{\partial x}, \quad V = \frac{\partial h}{\partial v}\]
covariance
\[w \sim \mathcal{N}(0, Q), \quad v \sim \mathcal{N}(0, R)\]
\[e_x \sim \mathcal{N}(0, P) \rightarrow x \sim \mathcal{N}(\bar{x}, P), \quad e_z \sim \mathcal{N}(0, S)\]
\[P = \texttt{cov}(e_x) = E(e_x e_x^T), \quad S = \texttt{cov}(e_z) = E(e_z e_z^T)\]
Prediction
state prediction (w/o noise)
\[\hat{x}_k^- = f(\hat{x}_{k-1}, 0)\]
(error state) covariance
\[\text{cov}(x_k - \hat{x}_k^-) = P_k^- = A_k P_{k-1} A_k^T + W_k Q_{k-1} W_k^T\]
Update
Kalman gain
\[K_k = P_k^- H_k^T S^{-1}, \quad S = H_k P_k^- H_k^T + V_k R_k V_k^T\]
state update
\[\hat{x}_k = \hat{x}_k^- + K_k (z_k - h(\hat{x}_k^-, 0))\]
covariance update
\[\text{cov}(x_k - \hat{x}_k) = P_k = (I - K_k H_k) P_k^-\]
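The prediction and update equations above can be traced with a scalar state. In this sketch the noise Jacobians are assumed to be \(W = V = 1\), the models and their derivatives are passed in as functions, and `ekf_step` is a hypothetical name.

```python
def ekf_step(x_hat, P, z, f, h, A, H, Q, R):
    """One scalar EKF cycle; returns the updated (x_hat, P)."""
    # Prediction
    x_pred = f(x_hat)                      # state prediction without noise
    a = A(x_hat)
    P_pred = a * P * a + Q                 # P^- = A P A^T + W Q W^T
    # Update
    hj = H(x_pred)
    S = hj * P_pred * hj + R               # innovation covariance
    K = P_pred * hj / S                    # Kalman gain
    x_new = x_pred + K * (z - h(x_pred))   # state update
    P_new = (1.0 - K * hj) * P_pred        # covariance update
    return x_new, P_new


# Linear special case: static state observed directly.
x1, P1 = ekf_step(0.0, 1.0, 2.0,
                  f=lambda x: x, h=lambda x: x,
                  A=lambda x: 1.0, H=lambda x: 1.0,
                  Q=0.0, R=1.0)
```

With equal prior and measurement variances the gain is 0.5, so the estimate moves halfway toward the measurement and the variance is halved.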
MSCKF
Prediction
state prediction (state prior)
\[PVQ\]
propagate error cov P
continuous-time to discrete-time: the discrete-time state transition matrix and noise covariance matrix are computed fairly accurately, for example
\[F = \exp(A \Delta t) \approx I + A \Delta t + \frac{1}{2} (A \Delta t)^2 + \frac{1}{6} (A \Delta t)^3\]
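The truncated series can be evaluated with plain nested lists. This is a sketch under the assumption that A is constant over the step; `transition_matrix` and `mat_mul` are hypothetical helpers, and `order=3` reproduces the third-order truncation above.

```python
def mat_mul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transition_matrix(A, dt, order=3):
    """F = I + A dt + (A dt)^2 / 2! + ... up to the given order."""
    n = len(A)
    F = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in F]
    A_dt = [[v * dt for v in row] for row in A]
    for k in range(1, order + 1):
        # term becomes (A dt)^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, A_dt)]
        F = [[F[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return F


# Constant-velocity model: d/dt [position, velocity] = [velocity, 0].
F = transition_matrix([[0.0, 1.0], [0.0, 0.0]], dt=0.1)
```

For this A the series terminates after the first-order term because \(A^2 = 0\), so the truncation is exact; for a general A the neglected terms are of order \(\Delta t^4\).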
Probability distribution of the error state
\[\delta x \sim \mathcal{N}(\hat{\delta x}, P), \quad n \sim \mathcal{N}(0, Q)\]
Error covariance propagation (over the whole system process)
\[\delta x_{i+1} = F \delta x_i + G n, \quad P = F P F^T + G Q G^T\]
Update (ESKF)
\[\delta x \sim \mathcal{N}(\hat{\delta x}, P)\]
\[z = h(x) + v \approx h(x_0) + H \delta x + v, \quad v \sim \mathcal{N}(0, R)\]
predicted residual (innovation)
\[r = z - h(x_0) = H \delta x + v\]
then, the covariance of innovation
\[cov(r, r) = E(rr^T) = E(H \delta x \delta x^T H^T + vv^T) = HPH^T + R\]
update state and covariance
\[\delta x = K r\]
\[P = (I-KH)P\]
MAP (VINS-Mono)
Prediction
state prediction (state prior)
\[PVQ\]
pre-integration (propagate error cov P & state)
continuous-time to discrete-time
\[F = \exp(A \Delta t) \approx I + A \Delta t\]
Probability distribution of the error state
\[\delta x \sim \mathcal{N}(0, P), \quad n \sim \mathcal{N}(0, Q)\]
Error-state (state preintegration) and covariance propagation (initialized at image time k, propagated from image k to image k+1)
\[\delta x_{i+1} = F \delta x_i + G n, \quad P = F P F^T + G Q G^T\]
Update (MAP)
Jacobian & information matrix in MAP
\[J = \frac{\partial r}{\partial \delta x}\]
IMU
Covariance matrix (the inverse of the information matrix)
\[\text{cov}(\delta x_k) = P\]
Cam
Covariance matrix (the inverse of the information matrix)
\[\text{cov}(r_k) = \Sigma_{\pi}\]
update state
\[x = x + \delta x\]
QA
Jacobi when Linear and Nonlinear
Nonlinear function in Euclidean space
\[h(x) \approx h(x_0) + H \Delta x, \quad \left. H = \frac{\partial h(x)}{\partial x} \right|_{x = x_0}\]
When \(h(x)\) is linear
\[h(x) = Hx\]
Jacobi w.r.t Error or True State
\[f(x_0 \oplus \Delta x) = F(\Delta x)\]
\[f(x_0 \oplus \Delta x) \approx f(x_0) + \left. \frac{\partial f(x_0 \oplus \Delta x)}{\partial x} \right|_{x=x_0} \Delta x\]
\[F(\Delta x) \approx F(0) + \left. \frac{\partial F(\Delta x)}{\partial \Delta x} \right|_{\Delta x = 0} \Delta x =f(x_0) +\left. \frac{\partial f(x_0 \oplus \Delta x)}{\partial \Delta x} \right|_{\Delta x = 0} \Delta x\]
When x lies in Euclidean space, the two expressions above are equivalent.
ref: https://zhuanlan.zhihu.com/p/75714471
Jacobi in EKF & MAP
Whichever state serves as the optimization variable, the Jacobian is taken with respect to that state.
EKF
	w.r.t true state	w.r.t error state
measurement function	\(h(x)\)	\(h(\Delta x)\)
Jacobi	\(\frac{\partial h(x)}{\partial x}\)	\(\frac{\partial h(x)}{\partial \Delta x}\)
init state	\(x = x_0\)	\({\Delta x}=0\)
update	\(x \oplus \Delta x, \Delta x = Kr\)	\(\Delta x = Kr\)
MAP
	w.r.t true-state	w.r.t error-state
cost function	\(f(x)\)	\(f(\Delta x)\)
Jacobi	\(\frac{\partial f(x)}{\partial x}\)	\(\frac{\partial f(x)}{\partial \Delta x}\)
init state	\(x = x_0\)	\({\Delta x}=0\)
iteration update	\(x \oplus \delta x\)	\(\Delta x \oplus \delta \Delta x\)


Parameterization of 3D Features in SLAM
2022-03-05T16:00:00.000Z
Overview
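General representation
Optimization-friendly form (avoiding overparameterization)
Feature parameterization, i.e. the way a feature is represented, determines which parameters are iteratively updated in optimization, or which parameters the Gaussian model is built on in an EKF. In both cases, what matters is the relationship (the Jacobian) between the feature's projection in the image and the feature parameters.
The overparameterization problem
A representation whose number of parameters exceeds the degrees of freedom it actually describes is called overparameterized.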
3D Point
\[P: \left[ X \; Y \; Z \right]^T\]
3 DoF
ref:
VSLAM中特征点的参数化表示
https://docs.openvins.com/update-feat.html#feat-rep
XYZ
Global XYZ
Anchored XYZ
Inverse Depth
Global inverse depth (spherical coordinates): has a singularity when the z-distance goes to zero
The spherical-coordinate inverse depth is numerically singular only when x, y, and z all approach 0, so it can be used in the global frame
Anchored inverse depth (MSCKF): inverse depth + normalized UV (bearing vector)
Anchored inverse depth (MSCKF single depth): inverse depth only
the single depth from VINS-Mono
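For the anchored MSCKF-style inverse depth, the conversion to and from a 3D point in the anchor camera frame can be sketched as below. It assumes the parameterization (u, v, rho) = (x/z, y/z, 1/z); the function names are hypothetical.

```python
def inverse_depth_to_xyz(u, v, rho):
    """Anchor-frame point from normalized image coordinates and inverse depth."""
    depth = 1.0 / rho
    return (u * depth, v * depth, depth)


def xyz_to_inverse_depth(x, y, z):
    """Inverse of the above; undefined as z approaches zero."""
    return (x / z, y / z, 1.0 / z)


point = inverse_depth_to_xyz(0.5, -0.25, 0.5)
params = xyz_to_inverse_depth(*point)
```

The division by z in `xyz_to_inverse_depth` is where the singularity appears when the z-distance goes to zero, which is why this form is tied to an anchor frame in which the feature lies in front of the camera.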
3D Line
\[L: \left[ P_0 \; P_1 \right]\]
4 DoF
ref:
SLAM中线特征的参数化和求导
Plücker coordinates
Overparameterization problem
Orthonormal representation
3D Plane
\[Ax + By + Cz + D = 0\]
3 DoF
ref:
SLAM中面特征的参数化
Hesse normal form
A plane is represented by its unit normal vector and its distance from the origin
\[\pi: \left(\mathbf{n}^{\top}, d\right)^{\top} \in \mathbb{R}^4\]
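Overparameterization problem: the Hesse form of a plane is overparameterized because the unit normal has 3 parameters but only two degrees of freedom.
Spherical coordinates
The unit normal can be viewed as a point on the unit sphere, so two angles \(\theta\) and \(\phi\) suffice to parameterize that point and hence the unit normal.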
\[\pi: \left[ \theta \; \phi \; d \right] \in \mathbb{R}^3\]
\[\tau=(\theta, \phi, d)^{\top}=q(\pi)=\left(\theta=\arctan \frac{n_{y}}{n_{x}}, \; \phi=\arcsin n_{z}, \; d\right)^{\top}\]
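The mapping between the two plane forms can be sketched as follows. Using `atan2` instead of a plain arctangent of \(n_y / n_x\) is an assumption made here so that \(n_x = 0\) and all four quadrants are handled; the function names are hypothetical.

```python
import math


def hesse_to_spherical(normal, d):
    """(unit normal, distance) -> (theta, phi, d)."""
    nx, ny, nz = normal
    theta = math.atan2(ny, nx)                  # azimuth
    phi = math.asin(max(-1.0, min(1.0, nz)))    # elevation
    return theta, phi, d


def spherical_to_hesse(theta, phi, d):
    """(theta, phi, d) -> (unit normal, distance)."""
    normal = (math.cos(phi) * math.cos(theta),
              math.cos(phi) * math.sin(theta),
              math.sin(phi))
    return normal, d


theta, phi, d = hesse_to_spherical((0.6, 0.0, 0.8), 2.0)
normal, d_back = spherical_to_hesse(theta, phi, d)
```

The three-parameter form has no redundancy, but the azimuth becomes ill-defined when the normal points along the z axis (\(\phi = \pm\pi/2\)).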
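Tangent plane
GTSAM updates the unit normal of a plane through its tangent plane, and the same approach can be used to optimize the unit normal.
Closest point
Unit quaternion
Degenerate quadric ^[1]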
Unified Representation
paper ^[1]
Reference
[1] Unified Representation of Heterogeneous Sets of Geometric Primitives

