Ceph


Hardware Recommendations

hardware recommendations — Ceph Documentation
https://docs.ceph.com/en/quincy/start/hardware-recommendations/

Status

ceph status
# OR: ceph -s

Example:

# ceph status
  cluster:
    id:     ff74f760-84b2-4dc4-b518-8408e3f10779
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum vm-05,vm-06,vm-07 (age 12m)
    mgr: vm-07(active, since 47m), standbys: vm-06, vm-05
    mds: 1/1 daemons up, 2 standby
    osd: 3 osds: 3 up (since 4m), 3 in (since 4m)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 3.68k objects, 13 GiB
    usage:   38 GiB used, 3.7 TiB / 3.7 TiB avail
    pgs:     97 active+clean

  io:
    client:   107 KiB/s rd, 4.0 KiB/s wr, 0 op/s rd, 0 op/s wr
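
For scripting, the same information can be requested as JSON; a small example using the ceph CLI's --format option:

ceph status --format json-pretty
# OR the short form:
ceph -s -f json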

Health

Health summary:

ceph health
# good health:
HEALTH_OK
# bad health:
HEALTH_WARN Reduced data availability: 47 pgs inactive, 47 pgs peering; 47 pgs not deep-scrubbed in time; 47 pgs not scrubbed in time; 54 slow ops, oldest one blocked for 212 sec, daemons [osd.0,osd.1,osd.2,osd.5,osd.9,mon.lmt-vm-05] have slow ops.

Health details:

ceph health detail
# good health:
HEALTH_OK
# bad health:
HEALTH_WARN 1 osds down; 1 host (1 osds) down; Reduced data availability: 47 pgs inactive, 47 pgs peering; 47 pgs not deep-scrubbed in time; 47 pgs not scrubbed in time; 49 slow ops, oldest one blocked for 306 sec, daemons [osd.0,osd.1,osd.2,osd.5,osd.9,mon.prox-05] have slow ops.
[WRN] OSD_DOWN: 1 osds down
    osd.5 (root=default,host=prox-06) is down
[WRN] OSD_HOST_DOWN: 1 host (1 osds) down
    host prox-06 (root=default) (1 osds) is down
[WRN] PG_AVAILABILITY: Reduced data availability: 47 pgs inactive, 47 pgs peering
    pg 3.0 is stuck peering for 6m, current state peering, last acting [3,5,4]
    pg 3.3 is stuck peering for 7w, current state peering, last acting [5,1,0]
...
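
On reasonably recent releases (Nautilus and later), a known warning can also be muted for a while so it does not drown out new problems; a sketch using the OSD_DOWN code shown above:

ceph health mute OSD_DOWN 1h
# clear the mute again:
ceph health unmute OSD_DOWN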

Watch

Watch live changes:

ceph -w
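
If the live stream is too noisy, a simple alternative is to poll the status every few seconds (assuming the standard watch utility is installed):

watch -n 5 ceph status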

OSD

List OSDs

volume lvm list

Note: this only shows OSDs local to the host the command is run on.

ceph-volume lvm list

Example:

====== osd.0 =======

  [block]       /dev/ceph-64fda9eb-2342-43e3-bc3e-78e5c1bcda31/osd-block-ff991dbd-7698-44ab-ad90-102340ec05c7

      block device              /dev/ceph-64fda9eb-2342-43e3-bc3e-78e5c1bcda31/osd-block-ff991dbd-7698-44ab-ad90-102340ec05c7
      block uuid                uvsm7p-c9KU-iaVe-GJGv-NBRM-xGrr-XPf3eB
      cephx lockbox secret
      cluster fsid              ff74f760-84b2-4dc4-b518-8408e3f10779
      cluster name              ceph
      crush device class
      encrypted                 0
      osd fsid                  ff991dbd-7698-44ab-ad90-102340ec05c7
      osd id                    0
      osdspec affinity
      type                      block
      vdo                       0
      devices                   /dev/fioa

[1]

osd tree

ceph osd tree

Example:

ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         3.69246  root default
-3         1.09589      host vm-05
 0    ssd  1.09589          osd.0           up   1.00000  1.00000
-7         1.09589      host vm-06
 2    ssd  1.09589          osd.2         down         0  1.00000
-5         1.50069      host vm-07
 1    ssd  1.50069          osd.1           up   1.00000  1.00000

List only the OSD tree nodes that are down: [2]

ceph osd tree down
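
The same filtering should work for other OSD states on current releases, for example:

ceph osd tree up
ceph osd tree out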

osd stat

ceph osd stat

osd dump

ceph osd dump
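
Related: per-OSD utilization and PG counts, flat or grouped by the CRUSH tree:

ceph osd df
ceph osd df tree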

Mark OSD In

 ceph osd in [OSD-NUM]

Mark OSD Out

 ceph osd out [OSD-NUM]
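
Note: in/out controls whether the OSD participates in data placement, while up/down reflects whether the daemon is running. For example, to take the hypothetical osd.2 from the tree above out of placement (which triggers rebalancing) and later re-add it:

ceph osd out 2
# data migrates off osd.2; when maintenance is done:
ceph osd in 2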

Delete OSD

First mark it out:

ceph osd out osd.{osd-num}

Mark it down:

ceph osd down osd.{osd-num}

Remove it:

ceph osd rm osd.{osd-num}

Check tree for removal:

ceph osd tree

---

If you get an error that the OSD is busy: [3]

Go to host that has the OSD and stop the service:

systemctl stop ceph-osd@{osd-num}

Remove it again:

ceph osd rm osd.{osd-num}

Check tree for removal:

ceph osd tree

If 'ceph osd tree' reports 'DNE' (does not exist), then do the following...

Remove from the CRUSH:

ceph osd crush rm osd.{osd-num}

Clear auth:

ceph auth del osd.{osd-num}

ref: [4]
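
Putting the steps above together, a sketch of a full removal for a hypothetical osd.5 (IDs and host are placeholders):

ceph osd out osd.5
# wait for rebalancing to finish, then on the host that owns osd.5:
systemctl stop ceph-osd@5
ceph osd rm osd.5
ceph osd crush rm osd.5
ceph auth del osd.5
# verify osd.5 no longer appears:
ceph osd tree

On Luminous and later, 'ceph osd purge osd.5 --yes-i-really-mean-it' should collapse the rm, crush rm, and auth del steps into a single command.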

Create OSD

Create OSD: [5]

pveceph osd create /dev/sd[X]

If the disk was in use before (for example, for ZFS or as an OSD), you first need to zap all traces of that usage:

ceph-volume lvm zap /dev/sd[X] --destroy
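
For example, wiping a hypothetical previously used /dev/sdb and then creating the OSD on it (the device name is illustrative):

ceph-volume lvm zap /dev/sdb --destroy
pveceph osd create /dev/sdb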

Create OSD ID:

ceph osd create
 # will generate the next ID in sequence

Create and mount the data directory:

mkdir -p /var/lib/ceph/osd/ceph-{osd-num}
mount -o user_xattr /dev/{hdd} /var/lib/ceph/osd/ceph-{osd-num}

Init data directory:

ceph-osd -i {osd-num} --mkfs --mkkey

Register:

ceph auth add osd.{osd-num} osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-{osd-num}/keyring

Add to CRUSH map:

ceph osd crush add {id-or-name} {weight}  [{bucket-type}={bucket-name} ...]
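
Filling in the placeholders, a sketch of the manual procedure for a hypothetical osd.3 backed by /dev/sdb1 on host vm-06 with weight 1.0 (all values are illustrative):

ceph osd create
# prints the new id, e.g. 3
mkdir -p /var/lib/ceph/osd/ceph-3
mount -o user_xattr /dev/sdb1 /var/lib/ceph/osd/ceph-3
ceph-osd -i 3 --mkfs --mkkey
ceph auth add osd.3 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-3/keyring
ceph osd crush add osd.3 1.0 host=vm-06
systemctl start ceph-osd@3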

POOL

Pool Stats

ceph osd pool stats
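
A couple of related commands for per-pool detail and overall usage:

ceph osd pool ls detail
ceph df
rados df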

References
