Discussion:
[mongodb-user] error while trying to stepdown mongodb replica set
Praveen
2017-06-10 15:02:31 UTC
Permalink
Hi,

Setup: 1 primary, 3 secondaries and 1 arbiter (All 5 are running in the
same machine)
When I'm doing a stepDown() in primary, getting the below error in the
shell -

m101:PRIMARY> rs.stepDown() 2017-06-07T15:01:21.357 E QUERY [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '127.0.0.1:27018' :
***@src/mongo/shell/db.js:132:1
***@src/mongo/shell/db.js:150:16
***@src/mongo/shell/utils.js:1261:12@(shell):1:1 2017-06-07T15:01:21.360 I NETWORK [thread1] trying reconnect to 127.0.0.1:27018 (127.0.0.1) failed2017-06-07T15:01:21.361 I NETWORK [thread1] reconnect 127.0.0.1:27018 (127.0.0.1) ok
m101:SECONDARY>



Note that the primary changed to secondary in the end. But why I'm getting
this error, "failed: network error" here?


Below is the section from the log file of the primary which was stepped
down -


2017-06-07T15:01:22.170 I REPL [replication-5] Restarting oplog query due to error: InterruptedDueToReplStateChange: operation was interrupted. Last fetched optime (with hash): { ts: Timestamp 1496827872000|1, t: 2 }[-3490433912114125886]. Restarts remaining: 32017-06-07T15:01:22.170 I REPL [replication-5] Scheduled new oplog query Fetcher source: localhost:27018 database: local query: { find: "oplog.rs", filter: { ts: { $gte: Timestamp 1496827872000|1 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 2 } query metadata: { $replData: 1, $oplogQueryData: 1, $ssm: { $secondaryOk: true } } active: 1 timeout: 65000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 159434 -- target:localhost:27018 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp 1496827872000|1 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 2 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms2017-06-07T15:01:22.173 W REPL [rsBackgroundSync] Fetcher stopped querying remote oplog with error: InvalidSyncSource: Sync source's last applied OpTime { ts: Timestamp 1496827872000|1, t: 2 } is not greater than our last fetched OpTime { ts: Timestamp 1496827872000|1, t: 2 }. Choosing new sync source.2017-06-07T15:01:22.176 I REPL [rsBackgroundSync] could not find member to sync from2017-06-07T15:01:22.180 I REPL [ReplicationExecutor] Member localhost:27018 is now in state SECONDARY2017-06-07T15:01:27.335 I REPL [SyncSourceFeedback] SyncSourceFeedback error sending update to localhost:27018: InvalidSyncSource: Sync source was cleared. Was localhost:270182017-06-07T15:01:29.879 I - [conn2] end connection 127.0.0.1:59400 (5 connections now open)2017-06-07T15:01:30.899 I NETWORK [thread1] connection accepted from 127.0.0.1:64421 #28 (5 connections now open)2017-06-07T15:01:30.899 I NETWORK [conn28] received client metadata from 127.0.0.1:64421 conn28: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.4" }, os: { type: "Windows", name: "Microsoft Windows 8", architecture: "x86_64", version: "6.2 (build 9200)" } }2017-06-07T15:01:30.917 I NETWORK [thread1] connection accepted from 127.0.0.1:64425 #29 (6 connections now open)2017-06-07T15:01:30.920 I NETWORK [conn29] received client metadata from 127.0.0.1:64425 conn29: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.4" }, os: { type: "Windows", name: "Microsoft Windows 8", architecture: "x86_64", version: "6.2 (build 9200)" } }2017-06-07T15:01:30.927 I NETWORK [thread1] connection accepted from 127.0.0.1:64428 #30 (7 connections now open)2017-06-07T15:01:30.928 I NETWORK [conn30] received client metadata from 127.0.0.1:64428 conn30: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.4" }, os: { type: "Windows", name: "Microsoft Windows 8", architecture: "x86_64", version: "6.2 (build 9200)" } }2017-06-07T15:01:31.561 I REPL [ReplicationExecutor] Starting an election, since we've seen no PRIMARY in the past 10000ms2017-06-07T15:01:31.562 I REPL [ReplicationExecutor] conducting a dry run election to see if we could be elected2017-06-07T15:01:31.563 I REPL [ReplicationExecutor] VoteRequester(term 2 dry run) received a no vote from localhost:27018 with reason "candidate's term is lower than mine"; response message: { term: 3, voteGranted: false, reason: "candidate's term is lower than mine", ok: 1.0 }2017-06-07T15:01:31.563 I REPL [ReplicationExecutor] not running for primary, we have been superceded already2017-06-07T15:01:32.183 I REPL [ReplicationExecutor] Member localhost:27019 is now in state PRIMARY2017-06-07T15:01:32.185 I REPL [rsBackgroundSync] sync source candidate: localhost:270192017-06-07T15:01:32.186 I ASIO [NetworkInterfaceASIO-RS-0] Connecting to localhost:270192017-06-07T15:01:33.198 I ASIO [NetworkInterfaceASIO-RS-0] Successfully connected to localhost:27019, took 1012ms (1 connections now open to localhost:27019)


Trying to understand what's happening here. Any leads would be appreciated.


Posted in StackOverflow earlier
(https://stackoverflow.com/questions/44409747/error-while-trying-to-stepdown-mongodb-replica-set)


Thanks in advance!
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/87d0a775-5564-4d51-a158-06db31d15992%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
'Stephen Steneker' via mongodb-user
2017-06-12 12:39:44 UTC
Permalink
Post by Praveen
Setup: 1 primary, 3 secondaries and 1 arbiter (All 5 are running in the
same machine)
When I'm doing a stepDown() in primary, getting the below error in the
shell -
m101:SECONDARY>
Note that the primary changed to secondary in the end. But why I'm getting
this error, "failed: network error" here?
Hi Praveen,

This is expected behaviour. When a primary steps down it drops all
connections, including your active connection in the mongo shell. The mongo
shell then reestablishes the connection (as per the "trying reconnect"
message).

The rs.stepDown()
<https://docs.mongodb.com/manual/reference/method/rs.stepDown/>
documentation could be clearer on this point with an example of the shell
output, but the relevant description is:

Upon successful stepdown, rs.stepDown() forces all clients currently
Post by Praveen
connected to the database to disconnect. This helps ensure that the clients
maintain an accurate view of the replica set.
Because the disconnect includes the connection used to run the command, you
Post by Praveen
cannot retrieve the return status of the command if the command completes
successfully; i.e. you can only retrieve the return status of the command
if it errors. When running the command in a script, the script should
account for this behavior.
Regards,
Stennie
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/be7cee61-5e4b-4b7b-857b-e0023798c68b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Loading...