Skip to content

Using Redis sentinel, some commands 'send' hangs #92

@jgwinner

Description

@jgwinner

Describe the bug
When using Redis sentinel - 1 master, 2 duplicates, 3 sentinels - I can always connect to the (curent) writeable master.

However, if I 'stop' that service, the next command hangs.

This may not be a bug, maybe it's proper use of Sentinel, but if this is the case the examples and/or docs should probably be updated. I can do a PR if that's the case.

To Reproduce
Steps to reproduce the behavior:

  1. Implement Redis Sentinel
  2. Connect to the current master
  3. Issue an 'lpop' command:
	redisClient->lpop(GetQueue().toStdString(), [&webReply,&bError](cpp_redis::reply& charRequest) {
                      .... stuff ... 
		});
	redisClient->sync_commit(std::chrono::milliseconds(5000)); 
  1. It works fine.
  2. stop the current master. The slaves will fail over and promote a new master.
  3. The next 'lpop' command hangs down inside this routine:
client::send(const std::vector<std::string> &redis_cmd, const reply_callback_t &callback) {
	std::lock_guard<std::mutex> lock_callback(m_callbacks_mutex);
	__CPP_REDIS_LOG(info, "cpp_redis::client attempts to store new command in the send buffer");
	unprotected_send(redis_cmd, callback);
	__CPP_REDIS_LOG(info, "cpp_redis::client stored new command in the send buffer");

Expected behavior
I would think issuing any serial commands to a connection that can't take it should result in a tacopie or cpp_redis error being thrown.

Possible workaround
My routine that does the connect DOES immediately exits out:

	LOG_INFO << "Connecting to HA Redis servers with " << iSentinels << "sentinels";
	bool bDisconnect = false;
	redisClient->connect("ourcluster", [&bDisconnect](const std::string& host, std::size_t port, cpp_redis::connect_state status) {
		if (status == cpp_redis::connect_state::dropped) {
			LOG_INFO << "Client disconnected from " << host.c_str() << ":" << port;

The 'dropped' fires right off, so I could assign a bool value, and then in the outer loop, before every command check to see if we're disconnected - but that seems like a lot of potential boilerplate code.

I guess I could also just try to reconnect inside the 'connect' callback? Is that the right way to do it? The code example didn't show this

Desktop (please complete the following information):

  • OS: [Windows 10]
  • Browser [n/a]
  • Version [Current]

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions