Skipping errors in Ansible - TopOshibok.ru - solving and fixing all kinds of errors

When Ansible receives a non-zero return code from a command or a failure from a module, by default it stops executing on that host and continues on other hosts. However, in some circumstances you may want different behavior. Sometimes a non-zero return code indicates success. Sometimes you want a failure on one host to stop execution on all hosts. Ansible provides tools and settings to handle these situations and help you get the behavior, output, and reporting you want.

Ignoring failed commands

By default Ansible stops executing tasks on a host when a task fails on that host. You can use ignore_errors to continue on in spite of the failure.

- name: Do not count this as a failure
  ansible.builtin.command: /bin/false
  ignore_errors: true

The ignore_errors directive only works when the task is able to run and returns a value of ‘failed’. It does not make Ansible ignore undefined variable errors, connection failures, execution issues (for example, missing packages), or syntax errors.
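
A common companion to ignore_errors is register, which keeps the ignored result available to later tasks. The sketch below is illustrative only: the variable name probe_result and the debug message are assumptions, not part of the original example.

```yaml
- name: Run a command that is expected to fail sometimes
  ansible.builtin.command: /bin/false
  register: probe_result        # hypothetical variable name
  ignore_errors: true

- name: React only if the previous task failed
  ansible.builtin.debug:
    msg: "Previous task failed with rc={{ probe_result.rc }}"
  when: probe_result is failed
```

The play continues either way; the second task simply makes the ignored failure visible.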

Ignoring unreachable host errors

New in version 2.7.

You can ignore a task failure due to the host instance being ‘UNREACHABLE’ with the ignore_unreachable keyword. Ansible ignores the task errors, but continues to execute future tasks against the unreachable host. For example, at the task level:

- name: This executes, fails, and the failure is ignored
  ansible.builtin.command: /bin/true
  ignore_unreachable: true

- name: This executes, fails, and ends the play for this host
  ansible.builtin.command: /bin/true

And at the playbook level:

- hosts: all
  ignore_unreachable: true
  tasks:
  - name: This executes, fails, and the failure is ignored
    ansible.builtin.command: /bin/true

  - name: This executes, fails, and ends the play for this host
    ansible.builtin.command: /bin/true
    ignore_unreachable: false

Resetting unreachable hosts

If Ansible cannot connect to a host, it marks that host as ‘UNREACHABLE’ and removes it from the list of active hosts for the run. You can use meta: clear_host_errors to reactivate all hosts, so subsequent tasks can try to reach them again.
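
As a minimal sketch, the meta task can sit between two ordinary tasks. The follow-up ping task is an assumption added here only to show that hosts are contacted again.

```yaml
- name: Reactivate hosts that were marked unreachable
  ansible.builtin.meta: clear_host_errors

- name: Try to reach every host again
  ansible.builtin.ping:
```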

Handlers and failure

Ansible runs handlers at the end of each play. If a task notifies a handler but
another task fails later in the play, by default the handler does not run on that host,
which may leave the host in an unexpected state. For example, a task could update
a configuration file and notify a handler to restart some service. If a
task later in the same play fails, the configuration file might be changed but
the service will not be restarted.

You can change this behavior with the --force-handlers command-line option,
by including force_handlers: True in a play, or by adding force_handlers = True
to ansible.cfg. When handlers are forced, Ansible will run all notified handlers on
all hosts, even hosts with failed tasks. (Note that certain errors could still prevent
the handler from running, such as a host becoming unreachable.)
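
A play-level sketch of this setting might look like the following. The group name, template, destination path, and service name are all hypothetical placeholders.

```yaml
- hosts: webservers              # hypothetical group
  force_handlers: true
  tasks:
    - name: Update the configuration file
      ansible.builtin.template:
        src: app.conf.j2         # hypothetical template
        dest: /etc/app.conf      # hypothetical path
      notify: Restart app

    - name: A later task that fails
      ansible.builtin.command: /bin/false

  handlers:
    - name: Restart app
      ansible.builtin.service:
        name: app                # hypothetical service
        state: restarted
```

With force_handlers set, the Restart app handler should still run on a host where the second task fails, provided the host stays reachable.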

Defining failure

Ansible lets you define what “failure” means in each task using the failed_when conditional. As with all conditionals in Ansible, lists of multiple failed_when conditions are joined with an implicit and, meaning the task only fails when all conditions are met. If you want to trigger a failure when any of the conditions is met, you must define the conditions in a string with an explicit or operator.

You may check for failure by searching for a word or phrase in the output of a command:

- name: Fail task when the command error output prints FAILED
  ansible.builtin.command: /usr/bin/example-command -x -y -z
  register: command_result
  failed_when: "'FAILED' in command_result.stderr"

or based on the return code:

- name: Fail task when both files are identical
  ansible.builtin.raw: diff foo/file1 bar/file2
  register: diff_cmd
  failed_when: diff_cmd.rc == 0 or diff_cmd.rc >= 2

You can also combine multiple conditions for failure. This task will fail if both conditions are true:

- name: Check if a file exists in temp and fail task if it does
  ansible.builtin.command: ls /tmp/this_should_not_be_here
  register: result
  failed_when:
    - result.rc == 0
    - '"No such" not in result.stderr'

If you want the task to fail when only one condition is satisfied, change the failed_when definition to

failed_when: result.rc == 0 or "No such" not in result.stderr

If you have too many conditions to fit neatly into one line, you can split it into a multi-line YAML value with >.

- name: example of many failed_when conditions with OR
  ansible.builtin.shell: "./myBinary"
  register: ret
  failed_when: >
    ("No such file or directory" in ret.stdout) or
    (ret.stderr != '') or
    (ret.rc == 10)

Defining “changed”

Ansible lets you define when a particular task has “changed” a remote node using the changed_when conditional. This lets you determine, based on return codes or output, whether a change should be reported in Ansible statistics and whether a handler should be triggered or not. As with all conditionals in Ansible, lists of multiple changed_when conditions are joined with an implicit and, meaning the task only reports a change when all conditions are met. If you want to report a change when any of the conditions is met, you must define the conditions in a string with an explicit or operator. For example:

tasks:

  - name: Report 'changed' when the return code is not equal to 2
    ansible.builtin.shell: /usr/bin/billybass --mode="take me to the river"
    register: bass_result
    changed_when: "bass_result.rc != 2"

  - name: This will never report 'changed' status
    ansible.builtin.shell: wall 'beep'
    changed_when: False

You can also combine multiple conditions to override “changed” result.

- name: Combine multiple conditions to override 'changed' result
  ansible.builtin.command: /bin/fake_command
  register: result
  ignore_errors: True
  changed_when:
    - '"ERROR" in result.stderr'
    - result.rc == 2

Note

Just like when, these two conditionals do not require templating delimiters ({{ }}) because they are implied.

See Defining failure for more conditional syntax examples.

Ensuring success for command and shell

The command and shell modules care about return codes, so if you have a command whose successful exit code is not zero, you can do this:

tasks:
  - name: Run this command and ignore the result
    ansible.builtin.shell: /usr/bin/somecommand || /bin/true
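
Appending || /bin/true hides every failure. If only specific non-zero codes mean success, a narrower alternative is to register the result and use failed_when, as described under Defining failure. In this sketch the exit code 3 and the variable name cmd_result are assumptions chosen for illustration.

```yaml
- name: Treat exit codes 0 and 3 as success
  ansible.builtin.command: /usr/bin/somecommand
  register: cmd_result               # hypothetical variable name
  failed_when: cmd_result.rc not in [0, 3]
```

Any other return code still fails the task, so genuine errors are not masked.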

Aborting a play on all hosts

Sometimes you want a failure on a single host, or failures on a certain percentage of hosts, to abort the entire play on all hosts. You can stop play execution after the first failure happens with any_errors_fatal. For finer-grained control, you can use max_fail_percentage to abort the run after a given percentage of hosts has failed.

Aborting on the first error: any_errors_fatal

If you set any_errors_fatal and a task returns an error, Ansible finishes the fatal task on all hosts in the current batch, then stops executing the play on all hosts. Subsequent tasks and plays are not executed. You can recover from fatal errors by adding a rescue section to the block. You can set any_errors_fatal at the play or block level.

- hosts: somehosts
  any_errors_fatal: true
  roles:
    - myrole

- hosts: somehosts
  tasks:
    - block:
        - include_tasks: mytasks.yml
      any_errors_fatal: true

You can use this feature when all tasks must be 100% successful to continue playbook execution. For example, if you run a service on machines in multiple data centers with load balancers to pass traffic from users to the service, you want all load balancers to be disabled before you stop the service for maintenance. To ensure that any failure in the task that disables the load balancers will stop all other tasks:

---
- hosts: load_balancers_dc_a
  any_errors_fatal: true

  tasks:
    - name: Shut down datacenter 'A'
      ansible.builtin.command: /usr/bin/disable-dc

- hosts: frontends_dc_a

  tasks:
    - name: Stop service
      ansible.builtin.command: /usr/bin/stop-software

    - name: Update software
      ansible.builtin.command: /usr/bin/upgrade-software

- hosts: load_balancers_dc_a

  tasks:
    - name: Start datacenter 'A'
      ansible.builtin.command: /usr/bin/enable-dc

In this example Ansible starts the software upgrade on the front ends only if all of the load balancers are successfully disabled.

Setting a maximum failure percentage

By default, Ansible continues to execute tasks as long as there are hosts that have not yet failed. In some situations, such as when executing a rolling update, you may want to abort the play when a certain threshold of failures has been reached. To achieve this, you can set a maximum failure percentage on a play:

---
- hosts: webservers
  max_fail_percentage: 30
  serial: 10

The max_fail_percentage setting applies to each batch when you use it with serial. In the example above, if more than 3 of the 10 servers in the first (or any) batch of servers failed, the rest of the play would be aborted.

Note

The percentage set must be exceeded, not equaled. For example, if serial were set to 4 and you wanted the task to abort the play when 2 of the systems failed, set the max_fail_percentage at 49 rather than 50.
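
The arithmetic in this note can be written out as a play header. With a batch of 4 hosts, 2 failures are 50 percent, which exceeds 49 but would not exceed 50. The group name is reused from the example above.

```yaml
- hosts: webservers
  serial: 4
  max_fail_percentage: 49   # 2 of 4 failed = 50%, which exceeds 49
```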

Controlling errors in blocks

You can also use blocks to define responses to task errors. This approach is similar to exception handling in many programming languages. See Handling errors with blocks for details and examples.
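
As a rough sketch of that approach, a block can be paired with rescue and always sections. The task names and messages here are illustrative assumptions; the upgrade command is borrowed from the earlier load balancer example.

```yaml
- hosts: all
  tasks:
    - name: Attempt an upgrade and recover on failure
      block:
        - name: Run the upgrade
          ansible.builtin.command: /usr/bin/upgrade-software

      rescue:
        - name: Report what failed
          ansible.builtin.debug:
            msg: "Task '{{ ansible_failed_task.name }}' failed"

      always:
        - name: Runs whether or not the block failed
          ansible.builtin.debug:
            msg: "Upgrade attempt finished"
```

Tasks in rescue run only when a task in the block fails; tasks in always run in both cases.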

Source

I found this to be helpful:

https://medium.com/opsops/anternative-way-to-handle-errors-in-ansible-245a066c340

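Register the result of each task you want to track:

register: some_name

Then add ignore_errors: yes to the same task so that the play continues after a failure.

Then use set_fact to combine the failed attribute of each registered result (e1 and e2 here are the registered names):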

- set_fact:
    success: '{{ not([e1, e2]|map(attribute="failed")|max) }}'

Then place this at the end of your block:

- name: Fail server build
  command: >
    bash scripts/test_file.sh
  when: success == false
  ignore_errors: yes

The task above runs only when success is false. The key is to combine ignore_errors with register. According to the linked article and my own testing, the failed attribute is registered whether the task fails or not.
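
Putting the pieces together, a playbook along these lines would be consistent with the task names in the example output. It is only a sketch: the command used for Task 2 is an assumption, and only the Task 1 command and the registered names e1 and e2 come from the snippets and output in this answer.

```yaml
- hosts: localhost
  tasks:
    - name: Task 1 test
      ansible.builtin.command: bash scripts/unknown_file.sh
      register: e1
      ignore_errors: true

    - name: Task 2 test
      ansible.builtin.command: /bin/true    # assumed command
      register: e2
      ignore_errors: true

    # success is true only if none of the registered results failed
    - ansible.builtin.set_fact:
        success: '{{ not([e1, e2]|map(attribute="failed")|max) }}'

    - name: Fail server build
      ansible.builtin.command: bash scripts/test_file.sh
      when: not (success | bool)
      ignore_errors: true

    - ansible.builtin.debug:
        var: success
```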

Example output:

PLAY [localhost] ***********************************************************************************************

TASK [Gathering Facts] *****************************************************************************************
ok: [localhost]

TASK [Task 1 test] *********************************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["bash", "scripts/unknown_file.sh"], "delta": "0:00:00.004343", "end": "2021-10-20 14:20:59.320389", "msg": "non-zero return code", "rc": 127, "start": "2021-10-20 14:20:59.316046", "stderr": "bash: scripts/unknown_file.sh: No such file or directory", "stderr_lines": ["bash: scripts/unknown_file.sh: No such file or directory"], "stdout": "", "stdout_lines": []}
...ignoring

TASK [Task 2 test] *********************************************************************************************
changed: [localhost]

TASK [set_fact] ************************************************************************************************
ok: [localhost]

TASK [Fail server build] ***************************************************************************************
changed: [localhost]

TASK [debug] ***************************************************************************************************
ok: [localhost] => {
    "success": false
}

PLAY RECAP *****************************************************************************************************
localhost                  : ok=6    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=1

Source

Errors are common when Ansible configures tasks on remote hosts. They signal distinct and possibly significant system states. Even so, there are some errors we may want to ignore so that the remaining tasks still run and report their output. This article discusses Ansible errors and how to disregard them, and demonstrates a technique for suppressing and ignoring failures.

Unlike fixing an error, ignoring a failure means continuing with the remaining tasks as long as the rest of the playbook is unaffected. Ansible reports a failure when it cannot finish a task or playbook. There are several possible causes, and it is up to us to identify them and find a solution. Unfortunately, not all errors can be fixed. You can choose to ignore an error if you do not want to resolve the issue or are unable to.

Many administrators use this strategy when working with target hosts in real-world scenarios. By default, when Ansible receives a non-zero return code from a command or a failure from a module, it stops executing on that host and continues on the other hosts. However, there are situations where you may want different behavior. A non-zero return code sometimes indicates success. Sometimes you want a failure on one host to stop execution on all hosts.

Ways of Ignoring the Errors in Ansible

Ansible offers several ways to keep a playbook running when a task fails:

1. Using the ignore_errors: true Keyword

If you add ignore_errors: true to a task, the playbook continues to run even when that task fails. Ansible carries out the next task whether the current one succeeds or fails.

2. Using Check Mode in Ansible

The Boolean special variable ansible_check_mode is set to True when Ansible runs in check mode. Use it to skip a task, or to ignore failures on a task, when the playbook is run in check mode.

3. Using the failed_when Conditional in an Ansible Playbook

We can also use the failed_when conditional to specify what “failure” means for each task. As with all Ansible conditionals, a list of multiple failed_when conditions is combined with an implicit and, so the task fails only if all conditions are satisfied.
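
The check mode option from the list above can be sketched as follows. The service name is a hypothetical placeholder; ansible_check_mode is the special variable that is true when the playbook runs with --check.

```yaml
- name: Skipped entirely in check mode
  ansible.builtin.service:
    name: app                    # hypothetical service
    state: restarted
  when: not ansible_check_mode

- name: Failures are ignored only in check mode
  ansible.builtin.service:
    name: app                    # hypothetical service
    state: started
  ignore_errors: "{{ ansible_check_mode }}"
```

In a normal run both tasks behave as usual, and a failure in the second task still stops the play on that host.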

Prerequisites to Ignore the Errors in Ansible

The example in this article assumes the following setup:

An Ansible control node, from which we run commands against the target device.
One or more target hosts on which to try the different ways of ignoring errors. In this example, we use localhost as the target.
We write the playbooks, run them from the Ansible control node, and observe the results on the target hosts.

The following example shows how to ignore errors in an Ansible playbook:

Example: Using the ignore_errors: true Keyword

This simple example puts several tasks in a playbook and runs them with ignore_errors. First, we create the playbook from the terminal:

[root@master ansible]# nano ignore_errors.yml

With ignore_errors.yml open in the editor, we enter the playbook contents. First, we set the “hosts” option to “localhost”. We set “gather_facts” to “false” so that Ansible does not collect additional information about the local host when the playbook runs.

After that, we list each task under the “tasks” option. The first task tries to list a file that does not exist in the Ansible directory. We give the task a name, then use the command module to run “ls” on the non-existent text file. We add ignore_errors: true to this task so that, if it fails, Ansible ignores the failure and moves on to the next task.

We then add a second task, which Ansible must run even if the first task fails. It uses the debug module to print a message.

- hosts: localhost
  gather_facts: false
  tasks:
    - name: List a non-existent file
      command: ls non-existent.txt
      ignore_errors: true

    - name: continue task after failing
      debug:
        msg: "Continue task after failure"

The playbook now has enough tasks to test ignore_errors. We save the playbook, exit the editor, and return to the terminal. Then we run the playbook with the following command:

[root@master ansible]# ansible-playbook ignore_errors.yml

When this command runs, the first task, which lists a non-existent file, fails. The second task still runs successfully because ignore_errors: true in the playbook tells Ansible to ignore the failure of the first task.

Conclusion

We learned what ignoring errors means in Ansible and how it works in a playbook. We also discussed the different ways of ignoring errors while running tasks, and worked through an example to illustrate the concept.


Source

Topics

Error Handling In Playbooks
- Ignoring Failed Commands
- Handlers and Failure
- Controlling What Defines Failure
- Overriding The Changed Result
- Aborting the play

Ansible normally has defaults that make sure to check the return codes of commands and modules, and
it fails fast, forcing an error to be dealt with unless you decide otherwise.

Sometimes a command that returns a non-zero code isn’t an error. Sometimes a command might not always
need to report that it ‘changed’ the remote system. This section describes how to change
the default behavior of Ansible for certain tasks so output and error handling behavior is
as desired.

Ignoring Failed Commands

New in version 0.6.

Generally playbooks will stop executing any more steps on a host that
has a failure. Sometimes, though, you want to continue on. To do so,
write a task that looks like this:

- name: this will not be counted as a failure
  command: /bin/false
  ignore_errors: yes

Note that ignore_errors only governs the ‘failed’ return value of the particular task,
so if you use an undefined variable, Ansible will still raise an error that you will need to address.
Nor does it prevent connection failures or execution issues: the task must be able to run and
return a value of ‘failed’.

Handlers and Failure

New in version 1.9.1.

When a task fails on a host, handlers which were previously notified
will not be run on that host. This can lead to cases where an unrelated failure
can leave a host in an unexpected state. For example, a task could update
a configuration file and notify a handler to restart some service. If a
task later on in the same play fails, the service will not be restarted despite
the configuration change.

You can change this behavior with the --force-handlers command-line option,
or by including force_handlers: True in a play, or force_handlers = True
in ansible.cfg. When handlers are forced, they will run when notified even
if a task fails on that host. (Note that certain errors could still prevent
the handler from running, such as a host becoming unreachable.)

Controlling What Defines Failure

New in version 1.4.

Suppose the error code of a command is meaningless and to tell if there
is a failure what really matters is the output of the command, for instance
if the string “FAILED” is in the output.

Ansible in 1.4 and later provides a way to specify this behavior as follows:

- name: this command prints FAILED when it fails
  command: /usr/bin/example-command -x -y -z
  register: command_result
  failed_when: "'FAILED' in command_result.stderr"

In previous versions of Ansible, this can still be accomplished as follows:

- name: this command prints FAILED when it fails
  command: /usr/bin/example-command -x -y -z
  register: command_result
  ignore_errors: True

- name: fail the play if the previous command did not succeed
  fail: msg="the command failed"
  when: "'FAILED' in command_result.stderr"

Overriding The Changed Result

New in version 1.3.

When a shell/command or other module runs it will typically report
“changed” status based on whether it thinks it affected machine state.

Sometimes you will know, based on the return code
or output that it did not make any changes, and wish to override
the “changed” result such that it does not appear in report output or
does not cause handlers to fire:

tasks:

  - shell: /usr/bin/billybass --mode="take me to the river"
    register: bass_result
    changed_when: "bass_result.rc != 2"

  # this will never report 'changed' status
  - shell: wall 'beep'
    changed_when: False

Aborting the play

Sometimes it’s desirable to abort the entire play on failure, not just skip remaining tasks for a host.

The any_errors_fatal play option will mark all hosts as failed if any fails, causing an immediate abort:

- hosts: somehosts
  any_errors_fatal: true
  roles:
    - myrole

For finer-grained control, max_fail_percentage can be used to abort the run after a given percentage of hosts has failed.

Source
