
port-forward for grafana hangs indefinitely: Grafana webpage not accessible #15

Open
alehanderoo opened this issue Jun 4, 2024 · 5 comments

Comments

@alehanderoo

Hi @geerlingguy,

First of all, thank you for open-sourcing this!
I’ve learned a lot about Ansible and server configuration over the last few days (and nights)!
What a fantastic tool!

Describe the bug

  • When I log into my control_plane node, I run sudo su and copy the config with cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
  • Then, in nano ~/.kube/config, I changed 127.0.0.1 to 192.168.2.52 (the wlan0 address of the control_plane, on which Drupal is accessible from my workstation):
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: xxx
    server: https://192.168.2.52:6443
  name: default
contexts:
- context:
    cluster: default
    user: default
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
  user:
    client-certificate-data: xxx
    client-key-data: xxx

When I then run kubectl port-forward service/cluster-monitoring-grafana :80 (both as my user and as root), the command never completes and Grafana is never accessible.

root@node1:/home/rock# kubectl port-forward service/cluster-monitoring-grafana :80
Forwarding from 127.0.0.1:46238 -> 3000
Forwarding from [::1]:46238 -> 3000

Opening http://192.168.2.52:46238/ does not return a page.
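Note that the forward above binds only to loopback (127.0.0.1 and ::1), so it is unreachable via 192.168.2.52 by design. If the goal is to reach a port-forward from another machine, kubectl has an --address flag; a sketch (the local port 3000 here is an arbitrary choice, not from the issue):

```shell
# Bind the forward to all interfaces instead of loopback only
kubectl port-forward --address 0.0.0.0 service/cluster-monitoring-grafana 3000:80
# Grafana should then be reachable at http://192.168.2.52:3000/ while the command runs
```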

Troubleshooting

I'm running a self-built cluster.
Control_plane on a rockpi4:

 Static hostname: rockpi4c
Operating System: Ubuntu 20.04.6 LTS
          Kernel: Linux 4.4.154-112-rockchip-gfdb18c8bab17
    Architecture: arm64

Remaining 4 nodes (RPi4 and RPi3):

Operating System: Debian GNU/Linux 12 (bookworm)
          Kernel: Linux 6.6.31+rpt-rpi-v8
    Architecture: arm64

Networking:

  • I needed to update the networking.yml so my nodes get internet access through wlan0 of the rockpi; this works
  • my configure_routing.yml file for reference (I run this playbook prior to running main.yml):
---
- name: Set up static networking configuration.
  hosts: cluster
  gather_facts: false
  become: true
  vars_files:
  - config.yml
  tasks:
    - name: Configure hosts file so nodes can see each other by hostname.
      ansible.builtin.blockinfile:
        path: /etc/hosts
        marker: "# ANSIBLE MANAGED - static ip config {mark}"
        block: |
          {% for host in groups['cluster'] %}
          {{ ipv4_subnet_prefix }}.{{ hostvars[host].ip_host_octet }} {{ host }} {{ host | regex_replace('\.local', '') }}
          {% endfor %}
        insertafter: EOF
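For illustration, with ipv4_subnet_prefix: 192.168.3 and hypothetical ip_host_octet values and hostnames (these exact octets and names are assumptions, not taken from the issue), the task above would render a block like:

```
# ANSIBLE MANAGED - static ip config BEGIN
192.168.3.69 node1.local node1
192.168.3.70 node2.local node2
192.168.3.71 node3.local node3
# ANSIBLE MANAGED - static ip config END
```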


- name: Configure Control Plane (Node1)
  hosts: control_plane
  become: true

  handlers:
    - name: restart dnsmasq
      ansible.builtin.service:
        name: dnsmasq
        state: restarted

    - name: persist iptables rules
      ansible.builtin.command: netfilter-persistent save
      
  tasks:
    - name: Install routing prerequisites.
      ansible.builtin.apt:
        name:
          - dnsmasq
          - netfilter-persistent
          - iptables-persistent
        state: present

    - name: Ensure netfilter-persistent is enabled.
      ansible.builtin.service:
        name: netfilter-persistent
        enabled: true

    - name: Ensure dnsmasq is running and enabled.
      ansible.builtin.service:
        name: dnsmasq
        state: started
        enabled: true

    - name: Enable IPv4 forwarding.
      ansible.posix.sysctl:
        name: net.ipv4.ip_forward
        value: '1'
        sysctl_set: yes

    - name: Remove default route via eth0
      ansible.builtin.command: ip route del default via 192.168.3.254 dev eth0
      ignore_errors: true

    - name: Add default route via wlan0 with correct metric
      ansible.builtin.command: ip route add default via 192.168.2.254 dev wlan0 metric 100
      ignore_errors: true
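The route tasks above fail on reruns (hence the ignore_errors), because ip route add/del are not idempotent. As a sketch, ip route replace reaches the same end state and is safe to run repeatedly (same addresses as above):

```yaml
    - name: Set default route via wlan0 with correct metric (idempotent sketch).
      ansible.builtin.command: ip route replace default via 192.168.2.254 dev wlan0 metric 100
      changed_when: false  # replace succeeds whether or not the route existed
```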

    - name: Flush existing NAT rules
      ansible.builtin.command: iptables -t nat -F

    - name: Flush existing FORWARD rules
      ansible.builtin.command: iptables -F FORWARD

    - name: Set up NAT for wlan0
      ansible.builtin.iptables:
        table: nat
        chain: POSTROUTING
        jump: MASQUERADE
        out_interface: wlan0
        source: 192.168.3.0/24
      notify: persist iptables rules

    - name: Ensure FORWARD chain allows traffic between interfaces
      ansible.builtin.iptables:
        table: filter
        chain: FORWARD
        jump: ACCEPT
        in_interface: eth0
        out_interface: wlan0
        source: 192.168.3.0/24
        ctstate: NEW,ESTABLISHED,RELATED
      notify: persist iptables rules

    - name: Ensure FORWARD chain allows returning traffic
      ansible.builtin.iptables:
        table: filter
        chain: FORWARD
        jump: ACCEPT
        in_interface: wlan0
        out_interface: eth0
        ctstate: ESTABLISHED,RELATED
      notify: persist iptables rules
    
    - name: Configure dnsmasq for bridged DNS.
      ansible.builtin.copy:
        dest: /etc/dnsmasq.d/bridge.conf
        content: |
          interface=eth0
          bind-interfaces
          server=1.1.1.1
          server=1.0.0.1
          domain-needed
          bogus-priv
      notify: restart dnsmasq

    # See: https://github.com/geerlingguy/turing-pi-2-cluster/issues/9
    - name: Add crontab task to restart dnsmasq.
      ansible.builtin.cron:
        name: "restart dnsmasq if not running"
        minute: "*"
        job: "/usr/bin/systemctl status dnsmasq || /usr/bin/systemctl restart dnsmasq"
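As a hedged alternative to the cron watchdog, systemd itself can restart dnsmasq when it dies, via a drop-in unit (the path and values below are standard systemd conventions, not taken from the issue):

```
# /etc/systemd/system/dnsmasq.service.d/override.conf
[Service]
Restart=on-failure
RestartSec=5
```

Run systemctl daemon-reload after creating the file for it to take effect.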



- name: Configure Nodes
  hosts: nodes
  become: true
  tasks:
    - name: Remove the incorrect default gateway
      ansible.builtin.command: ip route del default via 192.168.3.254 dev eth0
      ignore_errors: true

    - name: Set the correct default gateway
      ansible.builtin.command: ip route add default via 192.168.3.69
      ignore_errors: true

    - name: Ensure DNS configuration
      ansible.builtin.lineinfile:
        path: /etc/resolv.conf
        line: 'nameserver 8.8.8.8'
        create: true
        state: present

    - name: Ping google.com to check connectivity
      ansible.builtin.shell: |
        ping -c 4 google.com | grep 'time=' || echo "Ping failed"
      register: ping_test_result
      changed_when: false
      failed_when: ping_test_result.rc != 0 or 'ms' not in ping_test_result.stdout

    - name: Display ping test result
      ansible.builtin.debug:
        msg: "{{ ping_test_result.stdout }}"

Main installation:

  • Control_plane and nodes run k3s
  • I can access the drupal website via 192.168.2.52 (wlan0 of rockpi) after main.yml has finished
  • all pods seem to be running
root@node1:/home/rock# kubectl get nodes
NAME          STATUS   ROLES                  AGE   VERSION
node5         Ready    <none>                 51m   v1.29.5+k3s1
node3         Ready    <none>                 51m   v1.29.5+k3s1
node4         Ready    <none>                 51m   v1.29.5+k3s1
node1         Ready    control-plane,master   52m   v1.29.5+k3s1
node2         Ready    <none>                 51m   v1.29.5+k3s1

root@node1:/home/rock# kubectl get pods
NAME                                                    READY   STATUS    RESTARTS   AGE
nfs-subdir-external-provisioner-7df9c8b467-256mg        1/1     Running   0          50m
cluster-monitoring-prometheus-node-exporter-mm9gw       1/1     Running   0          49m
cluster-monitoring-prometheus-node-exporter-2xm24       1/1     Running   0          49m
cluster-monitoring-prometheus-node-exporter-lf4hn       1/1     Running   0          49m
cluster-monitoring-prometheus-node-exporter-ggg5z       1/1     Running   0          49m
cluster-monitoring-prometheus-node-exporter-h4kps       1/1     Running   0          49m
cluster-monitoring-kube-state-metrics-df8db86bb-zq4lz   1/1     Running   0          49m
cluster-monitoring-kube-pr-operator-b44c59f5d-8qp84     1/1     Running   0          49m
cluster-monitoring-grafana-5b4dd85976-8cv2m             3/3     Running   0          49m
prometheus-cluster-monitoring-kube-pr-prometheus-0      2/2     Running   0          48m
@alehanderoo
Author

Got it working already!
Posting it here for anyone having the same issue.

Run kubectl edit svc cluster-monitoring-grafana -n default on the control_plane node.
This opens the following manifest in vi.

Change the type to NodePort and add a nodePort to the port entry.

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: cluster-monitoring
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2024-06-04T15:41:14Z"
  labels:
    app.kubernetes.io/instance: cluster-monitoring
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: grafana
    app.kubernetes.io/version: 10.4.1
    helm.sh/chart: grafana-7.3.11
  name: cluster-monitoring-grafana
  namespace: default
  resourceVersion: "14694"
  uid: 2b047274-31cf-413b-8dc2-14b8571a8330
spec:
  clusterIP: 10.43.27.129
  clusterIPs:
  - 10.43.27.129
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http-web
    nodePort: 30080 # Optional: specify a port, or leave it to let Kubernetes assign one
    port: 80
    protocol: TCP
    targetPort: 3000
  selector:
    app.kubernetes.io/instance: cluster-monitoring
    app.kubernetes.io/name: grafana
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

run kubectl get svc cluster-monitoring-grafana -n default to validate the settings.

NAME                         TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
cluster-monitoring-grafana   NodePort   10.43.27.129   <none>        80:30080/TCP   4h27m
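The same edit can also be applied non-interactively; a sketch using kubectl patch with the service and port values from the manifest above (like the manual edit, Helm may still revert this on a redeploy):

```shell
# Equivalent to the kubectl edit above; a merge patch replaces the whole ports
# list, so the single port entry is restated with its nodePort pinned
kubectl patch svc cluster-monitoring-grafana -n default --type merge -p \
  '{"spec":{"type":"NodePort","ports":[{"name":"http-web","port":80,"targetPort":3000,"nodePort":30080}]}}'
```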

@alehanderoo
Author

Does not seem to work after a reboot.

@BicycleJohny

It is probably because it is handled by Helm. I have the same issue and I am trying to convince Helm to do it.

@BicycleJohny

BicycleJohny commented Sep 19, 2024

UPDATE:
Ok, it wasn't that hard after all. You can either extend this file at tasks[1].values with:

grafana:
  service:
    type: NodePort
    nodePort: 30080

and uninstall the release with helm, then reinstall it with ansible. Or you can uninstall it with helm and put all the values into a file like values.yaml:

alertmanager:
  enabled: false
grafana:
  service:
    type: NodePort
    nodePort: 30080

And then install it again with helm:

helm install prometheus-stack prometheus-community/kube-prometheus-stack -f values.yaml --kubeconfig /etc/rancher/k3s/k3s.yaml
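If you'd rather not keep a values file around, the same overrides can be passed inline; a sketch using --set (release and chart names as in the command above):

```shell
helm upgrade --install prometheus-stack prometheus-community/kube-prometheus-stack \
  --set alertmanager.enabled=false \
  --set grafana.service.type=NodePort \
  --set grafana.service.nodePort=30080 \
  --kubeconfig /etc/rancher/k3s/k3s.yaml
```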

It then creates the Grafana service with type NodePort, accessible on the specified port:

NAME                                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
kubernetes                                  ClusterIP   10.43.0.1       <none>        443/TCP             40h
prometheus-operated                         ClusterIP   None            <none>        9090/TCP            13m
prometheus-stack-grafana                    NodePort    10.43.10.202    <none>        80:30080/TCP        13m
prometheus-stack-kube-prom-operator         ClusterIP   10.43.89.109    <none>        443/TCP             13m
prometheus-stack-kube-prom-prometheus       ClusterIP   10.43.61.6      <none>        9090/TCP,8080/TCP   13m
prometheus-stack-kube-state-metrics         ClusterIP   10.43.165.23    <none>        8080/TCP            13m
prometheus-stack-prometheus-node-exporter   ClusterIP   10.43.155.255   <none>        9100/TCP            13m

@BicycleJohny

BicycleJohny commented Sep 19, 2024

Btw @alehanderoo, kubectl port-forward is a temporary thing, and it is expected to wait until the user terminates it: it creates a temporary forwarding rule and blocks until you are finished (Ctrl+C). That is why it looks like it hangs.

RanchoHam added a commit to RanchoHam/pi-cluster that referenced this issue Sep 25, 2024