From ADAM.WEAST at adtran.com Tue May 8 17:59:13 2018
From: ADAM.WEAST at adtran.com (ADAM WEAST)
Date: Tue, 8 May 2018 17:59:13 +0000
Subject: [sysrepo-devel] Confirmed Commit Capability (RFC 6241 8.4)
Message-ID: <7EDD5E133AC82344973036CBFFC67898C6C97B24@ex-mb1.corp.adtran.com>

Mikael / Michal / Radek,

Thank you guys for your feedback. We have thought about this and have come up with a netopeer2-based approach. Please let us know what you think.

To implement confirmed commit, we would need to save some state data, possibly with a new struct in common.h. This struct would hold some basic information:

* bool persist - Indicates whether this is a persistent confirmed commit or one tied to the life of a session
* nc_session* session - Points to the session that owns the commit if this is not a persistent commit
* char* persist_id - Holds the persistent id if this is a persistent commit
* possibly a mutex for protecting the struct?

A pointer to this struct will be added to the np2srv struct to allow global access. Only one confirmed commit can be active at a time, so it will be sufficient to have a single pointer. A null pointer would indicate that there is no active confirmed commit.

Below are the proposed changes to existing netopeer2 operations:

op_commit
    if the <confirmed> tag is present:
        if there is already a confirmed commit present (e.g. np2srv->confirmed is not null):
            if this is not the correct session or the wrong persist_id is used:
                return with an error
            else:
                perform the normal sr_copy_config
                extend the timer
        else:
            get high-level access to sysrepo (discussion below)
            use similar logic to op_getconfig to get a libyang node of the current running datastore
            use libyang to write the datastore to disk
            create the confirmed commit struct with the necessary information and save to np2srv
            restore normal access permissions for sysrepo
            perform the normal sr_copy_config
            create and set the timeout timer
    else (normal or confirming commit):
        if there is already a confirmed commit present:
            if this is not the correct session or the wrong persist_id is used:
                return with an error
            else:
                perform the normal sr_copy_config
                delete previous datastore data from the persistent storage
                delete confirmed commit struct from np2srv
                clean up any timer data
        else:
            perform the normal sr_copy_config

op_cancelcommit (also used for non-persistent session disconnects, timer timeouts, and sessions killed with kill-session)
    if there is a confirmed commit present:
        if this is not the correct session or the wrong persist_id is used:
            return an error
        else:
            get high-level access to sysrepo (discussion below)
            use similar logic to op_getconfig to get a libyang node of the current running datastore
            use libyang to load the previous datastore state from persistent storage
            use lyd_diff to find the difference between previous and current datastore state
            use logic similar to op_editconfig to commit previous state to running
            restore normal access permissions
            delete previous datastore data from the persistent storage
            delete confirmed commit struct from np2srv
            clean up any timer data
    else:
        return an error

netopeer2 startup
    if there is a previous running datastore on persistent storage:
        // assume netopeer crashed in the middle of a confirmed commit
        // we need to revert it
        create temporary session to sysrepo with NACM disabled
        use similar logic to op_getconfig to get a libyang node of the current running datastore
        use libyang to load the previous datastore state from persistent storage
        use lyd_diff to find the difference between previous and current datastore state
        use logic similar to op_editconfig to commit previous state to running
        close temporary session

op_lock
When a user attempts to lock the running datastore, we will add an additional check to ensure there is not an active confirmed commit before granting the lock.

Caveats:

To save a backup of the running datastore and revert it successfully, we need a way to access all datastore nodes regardless of the permission level of the current user. This corresponds with the sections of the pseudo-code above where we 'get high-level access to sysrepo'. There are specific use cases we can see that would require this. Assume low-privilege user A starts a persistent confirmed commit. High-privilege user B could add a follow-up confirmed commit with changes to data that A would normally not have access to. If A issues a cancel-commit, we would expect the datastore to return to its previous state, including any hidden changes B made. To do this, netopeer2 needs full read and write access to the entire running datastore regardless of the privilege level of the user issuing the initial confirmed commit or cancel commit commands. We have a couple of ideas, but we would like your input regarding what you believe would be the best way.

* use sr_session_set_options to disable NACM on the current session. This may not be sufficient however due to the file ACLs on sysrepo's repository files.
* create a temporary root (or highest-privilege possible) session into sysrepo. This should give us access to all modules regardless of who installed them into sysrepo.

Beyond these ideas, we cannot see a way to back up and restore all data to and from sysrepo without adding new Client Library calls to sysrepo itself.

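[Editorial sketch] The op_commit and op_cancelcommit flow proposed above can be modelled in a few lines of Python. This is only a toy model of the decision logic, not netopeer2 code: datastores are plain dicts, the "backup written to disk" is a field on the state object, timers are left out, and every name except persist, session and persist_id (taken from the proposed struct) is hypothetical.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfirmedCommit:
    """Toy counterpart of the proposed state struct."""
    persist: bool                 # persistent, or tied to one session
    session: Optional[int]        # owning session id (non-persistent case)
    persist_id: Optional[str]     # token for the persistent case
    backup: dict                  # stands in for the datastore saved to disk


class Server:
    def __init__(self, running: dict):
        self.running = running
        self.candidate = copy.deepcopy(running)
        self.confirmed: Optional[ConfirmedCommit] = None  # None = no active confirmed commit

    def _is_owner(self, session: int, persist_id: Optional[str]) -> bool:
        cc = self.confirmed
        if cc.persist:
            return persist_id == cc.persist_id
        return session == cc.session

    def commit(self, session, confirmed=False, persist=None, persist_id=None):
        if self.confirmed is not None and not self._is_owner(session, persist_id):
            return "error"
        if confirmed and self.confirmed is None:
            # First confirmed commit: save running before it is changed.
            self.confirmed = ConfirmedCommit(
                persist is not None, session, persist, copy.deepcopy(self.running))
        self.running = copy.deepcopy(self.candidate)  # the "normal sr_copy_config" step
        if not confirmed:
            self.confirmed = None  # plain or confirming commit: drop the backup
        return "ok"

    def cancel_commit(self, session, persist_id=None):
        if self.confirmed is None or not self._is_owner(session, persist_id):
            return "error"
        self.running = self.confirmed.backup  # revert to the saved state
        self.confirmed = None
        return "ok"
```

In this model a follow-up confirmed commit by the owner simply copies the candidate again, which is where the proposal would also extend the timer.
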
We also want to point out that we can only correctly manage locks on the running datastore if netopeer2 is the only client connected to sysrepo. According to the NETCONF RFC, users are not required to lock the datastore before starting a confirmed commit. However, if the datastore is not locked during an active confirmed commit, users cannot get a new lock on the running datastore until the commit is confirmed or cancelled. All of this logic would be managed within netopeer2 in this proposal. If a second, non-netopeer2 client connected to sysrepo, it could place a sysrepo lock on the running datastore during a netopeer2 confirmed commit. This would leave netopeer2 in a bad state if it attempted to cancel the ongoing confirmed commit. We personally do not foresee a need to have a second client connected to sysrepo, but we wanted to make sure you were aware of this possible limitation going forward.

In Michal's last response, it was suggested to implicitly lock the running datastore at the beginning of every confirmed commit. We were afraid this might interfere with confirmed commits with persist-ids. In this situation, we would expect a second netopeer2 session to be allowed to perform commits on the running datastore, regardless of which netopeer2 session started the confirmed commit. During a persistent confirmed commit, the original user may also disconnect, which would normally drop any sysrepo locks associated with the session. Also, based on the snippet of the NETCONF RFC below, we thought that any user could still issue a normal edit-config if the datastore is unlocked, but those changes have the potential to be removed or altered during a cancel commit. We understood this to be a strong suggestion for the user of the system to place a lock before starting a confirmed commit, not as a requirement of the implementation to do it automatically. Please let us know if you feel differently.

Snippet from RFC 6241 section 8.4.1:

For shared configurations, this feature can cause other configuration changes (for example, via other NETCONF sessions) to be inadvertently altered or removed, unless the configuration locking feature is used (in other words, the lock is obtained before the operation is started). Therefore, it is strongly suggested that in order to use this feature with shared configuration datastores, configuration locking SHOULD also be used.

We would still like to discuss further the use of the unmodified sr_commit to revert the data. In the unlikely case that a subscriber rejects the previous running datastore state, how should we expect netopeer2 to react? For confirmed commits with a large number of changes, it seems potentially dangerous that a single subscriber could block all changes. If the sr_commit was unsuccessful, how should netopeer2 behave? Right now, we only see two main options.

* return rpc-error in confirmed-commit response, but hold on to the internal confirmed commit context so the user can try again
* return rpc-error in confirmed-commit response, and throw away internal confirmed commit context (essentially confirming the commit)

We would prefer the first option, but this could get tricky if sysrepo returns an error during a timeout or a kill/close-session. Do you have any thoughts on this?

Thanks,
Adam Weast

This message has been classified Public by ADAM WEAST on Tuesday, May 08, 2018 at 12:59:14 PM.

From rkrejci at cesnet.cz Wed May 9 12:21:24 2018
From: rkrejci at cesnet.cz (Radek Krejčí)
Date: Wed, 9 May 2018 14:21:24 +0200
Subject: [sysrepo-devel] Confirmed Commit Capability (RFC 6241 8.4)
In-Reply-To: <7EDD5E133AC82344973036CBFFC67898C6C97B24@ex-mb1.corp.adtran.com>
References: <7EDD5E133AC82344973036CBFFC67898C6C97B24@ex-mb1.corp.adtran.com>
Message-ID: <26c62318-8536-7abb-7ae5-36c0665842e4@cesnet.cz>

Hi Adam,

On 8.5.2018 at 19:59, ADAM WEAST wrote:
> Mikael / Michal / Radek,
>
> Thank you guys for your feedback. We have thought about this and have come up with a netopeer2 based approach. Please let us know what you think.

Michal is currently out of the office, so expect his feedback next week.

> To implement confirmed commit, we would need to save some state data, possibly with a new struct in common.h. This struct would hold some basic information:
>
> * bool persist - Indicates that this is a persistent confirmed commit or one tied to the life of a session
> * nc_session* session - Points to the session that owns the commit if this is not a persistent commit
> * char* persist_id - Holds the persistent id if this is a persistent commit
> * possibly a mutex for protecting the struct?
>
> A pointer for this struct will be added to the np2srv struct to allow global access. Only one confirmed commit can be active at a time, so it will be sufficient to have a single pointer. A null pointer would indicate that there is no active confirmed commit.

I'm not sure about that "only one" - I'm not able to find the specification text that limits it this way. Actually, I'm afraid that since other manipulation of the data (edit-config) is allowed, multiple confirmed commits are formally allowed as well (although NETCONF locks should be used, which effectively prevents multiple confirmed commits). To be clear, I understand that limiting it this way really simplifies the implementation (and understanding what is happening there), and I vote for such a limitation. I just want to know whether this limitation comes from the specification or from the implementation.

> Below are the proposed changes to existing netopeer2 operations:
>
> [...]
>
> Caveats:
>
> To save a backup of the running datastore and revert it successfully, we need a way to access all datastore nodes regardless of the permission level of the current user. This corresponds with the sections of the psuedo-code above where we 'get high-level access to sysrepo'. There are specific use cases we can see that would require this. Assume low-privilege user A starts a persistent confirmed commit. High-privilege user B could add a follow-up confirmed commit with changes to data that A would normally not have access to. If A issues a cancel-commit, we would expect the datastore to return to its previous state, including any hidden changes B made. To do this, netopeer2 needs full read and write access to the entire running datastore regardless of the privilege level of the user issuing the initial confirmed commit or cancel commit commands. We have a couple of ideas, but we would like your input regarding what you believe would be the best way.
>
> * use sr_session_set_options to disable NACM on the current session. This may not be sufficient however due to the file ACLs on sysrepo's repository files.
> * create a temporary root (or highest-privilege possible) session into sysrepo. This should give us access to all modules regardless of who installed them into sysrepo.
>
> Beyond these ideas, we cannot see a way to backup and restore all data to and from sysrepo without adding new Client Library calls to sysrepo itself.

I believe there is already such an internal netopeer2 session (used e.g. to get schemas from sysrepo) - np2srv.sr_sess in main.c.

> We also want to point out that we can only correctly manage locks on the running datastore if netopeer2 is the only client connected to sysrepo. According to the NETCONF RFC, users are not required to lock the datastore before starting a confirmed commit. However, if the datastore is not locked during an active confirmed commit, users cannot get a new lock on the running datastore until the commit is confirmed or cancelled. All of this logic would be managed within netopeer2 in this proposal. If a second, non-netopeer2 client connected to sysrepo, it could place a sysrepo lock on the running datastore during a netopeer2 confirmed commit. This would leave netopeer2 in a bad state if it attempted to cancel the ongoing confirmed commit. We personally do not foresee a need to have a second client connected to sysrepo, but we wanted to make sure you were aware of this possible limitation going forward.

As Michal wrote, I'm for using sr_lock_datastore() when starting a confirmed commit in netopeer2-server. Regarding your note about persist and actually splitting the commit and the confirm between 2 NETCONF sessions, what about taking the lock on netopeer2's internal sysrepo session? I'm not sure (I will wait for Michal's view), because the change itself must be done on the NETCONF client's sysrepo session (because of NACM), while this locking (granting access only to netopeer2) would be done in netopeer2's internal session (to make it persistent). I believe it is doable, although it is not straightforward. It also solves the problem of releasing this lock when the NETCONF session is terminated.

> In Michal's last response, it was suggested to implicitly lock the running datastore at the beginning of every confirmed commit. We were afraid this might interfere with confirmed commits with persist-ids. In this situation, we would expect a second netopeer2 session to be allowed to perform commits on the running

Actually, this is the question - the specification only "strongly suggests" that (NETCONF) clients use (NETCONF) locking. With the approach proposed by Michal, the locking is actually forced; in other words, changes during an ongoing confirmed commit are prohibited. The question is whether we are fine with such an implementation. I am, because otherwise it would really complicate the implementation and introduce the problems you have mentioned. What do the other developers think?

> datastore, regardless of which netopeer2 session started the confirmed commit. During a persistent confirmed commit, the original user may also disconnect, which would normally drop any sysrepo locks associated with the session. Also, based on the snippet of the NETCONF RFC below, we thought that any user could still issue a normal edit-config if the datastore is unlocked, but those changes have the potential to be removed or altered during a cancel commit. We understood this to be a strong suggestion for the user of the system to place a lock before starting a confirming commit, not as a requirement of the implementation to do it automatically. Please let us know if you feel differently.

I think it is more of a recommendation for the clients performing those edit-config changes - for them the lock will not be granted because of the ongoing confirmed commit. And as you wrote, when a NETCONF lock is taken before the confirmed commit and the session is then closed (which can be expected when using confirmed commit, e.g. because the management interface configuration is being changed), the NETCONF lock will be automatically released.

I would like to have an implementation without internal sysrepo locking that allows edit-config while there is an ongoing confirmed commit. But I'm a little skeptical about such an ultimate solution when you don't have full control of the data access. Yes, this is the reason for your initial idea to implement it inside sysrepo. Anyway, I would still prefer the simpler implementation in netopeer2, which does not cover all the cases / does not allow all the possibilities mentioned in the specification. It is stricter than the specification, but only to be safer (to avoid inconsistent states), and I don't think the limitation is critical in this case. So, the confirmed commit would be implemented as an operation that takes some (specified) time, and during this time it is not possible to manipulate the datastore content. Is that fine for everyone?

> Snippet from RFC 6241 section 8.4.1:
>
> [...]
>
> We would still like to discuss further the use of the unmodified sr_commit to revert the data. In the unlikely case that a subscriber rejects the previous running datastore state, how should we expect netopeer2 to react? For confirmed commits with a large number of changes, it seems potentially dangerous that a single subscriber could block all changes. If the sr_commit was unsuccessful, how should netopeer2 behave? Right now, we only see two main options.
>
> * return rpc-error in confirmed-commit response, but hold on to the internal confirmed commit context so the user can try again
> * return rpc-error in confirmed-commit response, and throw away internal confirmed commit context (essentially confirming the commit)
>
> We would prefer the first option, but this could get tricky if sysrepo returns an error during a timeout or a kill/close-session. Do you have any thoughts on this?

As you wrote - it is probably clear only for cancel-commit. That is also the only situation in which you are able to send an rpc-error. Keeping the internal context would mean that the confirmed commit is still ongoing, so all the limitations on locks (and, with internal sysrepo locking, also on other operations) would still be active. Maybe keep the context and the confirmed commit ongoing only in the case of a failure on cancel-commit. In other cases, just log the errors, remove the context and leave the data as it is.

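[Editorial sketch] The failure policy suggested in the paragraph above can be written down as a small decision function. This is a hedged Python illustration only; the Trigger enum, the after_revert name and the logger name are hypothetical and do not exist in netopeer2 or sysrepo.

```python
import logging
from enum import Enum
from typing import Optional, Tuple

log = logging.getLogger("confirmed-commit-sketch")  # hypothetical logger name


class Trigger(Enum):
    CANCEL_COMMIT = "cancel-commit"   # explicit RPC, so a reply can be sent
    TIMEOUT = "timeout"               # timer expired, nobody to reply to
    SESSION_END = "session-end"       # close-session / kill-session / disconnect


def after_revert(trigger: Trigger, revert_ok: bool) -> Tuple[bool, Optional[str]]:
    """Return (keep_context, rpc_reply) once the revert attempt has finished.

    rpc_reply is None whenever there is no RPC to answer.
    """
    if revert_ok:
        return False, ("ok" if trigger is Trigger.CANCEL_COMMIT else None)
    if trigger is Trigger.CANCEL_COMMIT:
        # The client is still there: report the error and let it retry.
        return True, "rpc-error"
    # Nobody to report to: log, drop the context, leave the data as it is.
    log.error("revert failed on %s; dropping confirmed-commit context", trigger.value)
    return False, None
```

Keeping the context only for an explicit cancel-commit means the lock restrictions stay active just in the one case where a client can actually see the error and try again.
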
One note on reverting changes (and I hope that Michal will confirm this) - instead of getting and storing the complete datastore, I am thinking about a feature in sysrepo: snapshots. AFAIK, sysrepo translates complex datastore changes (copy-config) into simple transactions using lyd_diff(). So any change of the data is a list of transactions with the ability to revert on sr_commit() failure. By storing the transactions not just between sr_commit() calls, but between 2 (somehow specified) versions of the datastore context, it would be possible to revert to a specific snapshot more effectively than by storing a copy of the datastore. But that's just an idea - usable by the confirmed commit, but not necessary for it. And maybe useful for any sysrepo application, not just the NETCONF server. It seems to me like a more generic version of confirmed commit for local use of sysrepo - from time to time it is possible to take a snapshot of the datastore for a future revert.

Radek

PS: one note regarding storing the datastore for future revert - I have created issue #1117 as a proposal for a sysrepo enhancement. It's just an idea - usable, but not necessary for your work.

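[Editorial sketch] The snapshot idea, keeping a journal of changes since a marked version and undoing it in reverse instead of storing a full copy, can be illustrated on a flat dict keyed by path. This is a toy Python model under that assumption; it does not use lyd_diff() or any sysrepo API, and the diff and revert helpers are hypothetical.

```python
_MISSING = object()  # marks "path did not exist" on one side of a change


def diff(old: dict, new: dict) -> list:
    """List of (path, before, after) for every path whose value differs."""
    changes = []
    for path in sorted(old.keys() | new.keys()):
        before = old.get(path, _MISSING)
        after = new.get(path, _MISSING)
        if before != after:
            changes.append((path, before, after))
    return changes


def revert(data: dict, journal: list) -> None:
    """Undo journalled changes in reverse order, restoring the snapshot state."""
    for path, before, _after in reversed(journal):
        if before is _MISSING:
            data.pop(path, None)      # path was created after the snapshot
        else:
            data[path] = before       # path was modified or deleted


# Usage: journal two commits made after the snapshot, then roll both back.
snapshot = {"/if/eth0/mtu": 1500, "/if/eth0/enabled": True}
step1 = dict(snapshot, **{"/if/eth0/mtu": 9000})
step2 = {"/if/eth0/mtu": 9000, "/if/eth1/mtu": 1500}
journal = diff(snapshot, step1) + diff(step1, step2)
current = dict(step2)
revert(current, journal)
```

The journal only grows with the size of the changes, which is the efficiency argument made above for snapshots over a full datastore copy.
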
> > > > _______________________________________________ > sysrepo-devel mailing list > sysrepo-devel at sysrepo.org > http://lists.sysrepo.org/listinfo/sysrepo-devel -- Radek Krejci mobile : +420 732 212 714 office : +420 234 680 256 e-mail : rkrejci at cesnet.cz LinkedIn: http://www.linkedin.com/in/radekkrejci CESNET, Association of Legal Entities Zikova 4 160 00 Praha 6 Czech Republic From mvasko at cesnet.cz Mon May 14 09:11:44 2018 From: mvasko at cesnet.cz (=?utf-8?q?Michal_Va=C5=A1ko?=) Date: Mon, 14 May 2018 11:11:44 +0200 Subject: [sysrepo-devel] =?utf-8?b?Pz09P3V0Zi04P3E/ICBDb25maXJtZWQgQ29t?= =?utf-8?q?mit_Capability_=28RFC_6241_8=2E4=29?= In-Reply-To: <26c62318-8536-7abb-7ae5-36c0665842e4@cesnet.cz> Message-ID: <3bae-5af95300-29-3c89a240@63938217> Hi everyone, my thoughts on this are inline. On Wednesday, May 9, 2018 14:21 CEST, Radek Krej?? wrote: > Hi Adam, > > Dne 8.5.2018 v 19:59 ADAM WEAST napsal(a): > > > > Mikael / Michal / Radek, > > > > ? > > > > Thank you guys for your feedback.? We have thought about this and have come up with a netopeer2 based approach.? Please let us know what you think. > > > > Michal is currently out of the office, so expect feedback next week > > > ? > > > > To implement confirmed commit, we would need to save some state data, possibly with a new struct in common.h. This struct would hold some basic information: > > > > ? > > > > * bool persist - Indicates that this is a persistent confirmed commit or one tied to the life of a session > > > > * nc_session* session - Points to the session that owns the commit if this is not a persistent commit > > > > * char* persist_id - Holds the persistent id if this is a persistent commit > > > > * possibly a mutex for protecting the struct? > > > > ? > > > > A pointer for this struct will be added to the np2srv struct to allow global access. Only one confirmed commit can be active at a time, so it will be sufficient to have a single pointer. 
A null pointer would indicate that there is no active confirmed commit. > > > > I'm not sure about that "only one" - I'm not able to find the specification text that limits it this way. Actually I'm afraid that since other manipulation (edit-config) with data is allowed, also multiple confirmed commit are formally allowed (despite NETCONF locks should be used which effectively avoid multiple confirmed commits). To be clear, I understand, that limiting it this way really simplify the implementation (and understanding what is happening there), and I vote for such a limitation, I just want to know if this limitation is by specification or by implementation. After rereading the specification, I agree with Radek. I have found no explicit mention of any restriction of concurrent confirmed commits. However, allowing them would most likely cause several problems (if 2 confirmed commits timeout, to what state is finally restored?) while adding no real value to the functionality. So, I support the idea of us imposing this limitation ourselves. > > > ? > > > > Below are the proposed changes to existing netopeer2 operations: > > > > op_commit > > > > ??? if the tag is present: > > > > ??????? if there is already a confirmed commit present (e.g. np2srv->confirmed is not null): > > > > ??????????? if this is not the correct session or the wrong persist_id is used: > > > > ??????????????? return with an error > > > > ??????????? else: > > > > ??????????????? perform the normal sr_copy_config > > > > ??????????????? extend the timer > > > > ??????? else: > > > > ??????????? get high-level access to sysrepo (discussion below) > > > > ??????????? use similar logic to op_getconfig to get a libyang node of the current running datastore > > > > ??????????? use libyang to write the datastore to disk > > > > ??????????? create the confirmed commit struct with the necessary information and save to np2srv > > > > ??????????? restore normal access permissions for sysrepo > > > > ??????????? 
perform the normal sr_copy_config > > > > ??????????? create and set the timeout timer > > > > ??? else (normal or confirming commit): > > > > ??????? if there is already a confirmed commit present: > > > > ?????????? ?if this is not the correct session or the wrong persist_id is used: > > > > ??????????????? return with an error > > > > ??????????? else: > > > > ??????????????? perform the normal sr_copy_config > > > > ??????????????? delete previous datastore data from the persistent storage > > > > ?????? ?????????delete confirmed commit struct from np2srv > > > > ??????????????? clean up any timer data > > > > ??????? else: > > > > ??????????? perform the normal sr_copy_config > > > > ? > > > > op_cancelcommit (also used for non-persistent session disconnects, timer timeouts, and sessions killed with kill-session) > > > > ??? if there is a confirmed commit present: > > > > ??????? if this is not the correct session or the wrong persist_id is used: > > > > ??????????? return an error > > > > ??????? else: > > > > ??????????? get high-level access to sysrepo (discussion below) > > > > ?????????? ?use similar logic to op_getconfig to get a libyang node of the current running datastore > > > > ??????????? use libyang to load the previous datastore state from persistent storage > > > > ??????????? use lyd_diff to find the difference between previous and current datastore state > > > > ??????????? use logic similar to op_editconfig to commit previous state to running > > > > ??????????? restore normal access permissions > > > > ??????????? delete previous datastore data from the persistent storage > > > > ??????????? delete confirmed commit struct from np2srv > > > > ??????????? clean up any timer data > > > > ??? else: > > > > ??????? return an error > > > > ? > > > > netopeer2 startup > > > > ??? if there is a previous running datastore on persistent storage: > > > > ??????? // assume netopeer crashed in the middle of a confirmed commit > > > > ??????? 
// we need to revert it > > > > ??????? create temporary session to sysrepo with NACM disabled > > > > ??????? use similar logic to op_getconfig to get a libyang node of the current running datastore > > > > ??????? use libyang to load the previous datastore state from persistent storage > > > > ??????? use lyd_diff to find the difference between previous and current datastore state > > > > ??????? use logic similar to op_editconfig to commit previous state to running > > > > ??????? close temporary session > > > > ? > > > > op_lock > > > > When a user attempts to lock the running datastore, we will add an additional check to ensure there is not an active confirmed commit before granting the lock. > > > > ? > > > > Caveats: > > > > ? > > > > To save a backup of the running datastore and revert it successfully, we need a way to access all datastore nodes regardless of the permission level of the current user. This corresponds with the sections of the psuedo-code above where we 'get high-level access to sysrepo'. There are specific use cases we can see that would require this. Assume low-privilege user A starts a persistent confirmed commit. High-privilege user B could add a follow-up confirmed commit with changes to data that A would normally not have access to. If A issues a cancel-commit, we would expect the datastore to return to its previous state, including any hidden changes B made. To do this, netopeer2 needs full read and write access to the entire running datastore regardless of the privilege level of the user issuing the initial confirmed commit or cancel commit commands. We have a couple of ideas, but we would like your input regarding what you believe would be the best way. > > > > * use sr_session_set_options to disable NACM on the current session. This may not be sufficient however due to the file ACLs on sysrepo's repository files. > > > > * create a temporary root (or highest-privilege possible) session into sysrepo. 
This should give us access to all modules regardless of who installed them into sysrepo. > > > > Beyond these ideas, we cannot see a way to backup and restore all data to and from sysrepo without adding new Client Library calls to sysrepo itself. > > > > I believe there is already such an internal netopeer's session (used e.g. to get schemas from sysrepo) - np2srv.sr_sess in main.c. > Yes, just use this session and it should work as per your description. > > ? > > > > We also want to point out that we can only correctly manage locks on the running datastore if netopeer2 is the only client connected to sysrepo. According to the NETCONF RFC, users are not required to lock the datastore before starting a confirmed commit. However, if the datastore is not locked during an active confirmed commit, users cannot get a new lock on the running datastore until the commit is confirmed or cancelled. All of this logic would be managed within netopeer2 in this proposal. If a second, non-netopeer2 client connected to sysrepo, it could place a sysrepo lock on the running datastore during a netopeer2 confirmed commit. This would leave netopeer2 in a bad state if it attempted to cancel the ongoing confirmed commit. We personally do not foresee a need to have a second client connected to sysrepo, but we wanted to make sure you were aware of this possible limitation going forward. > > > > As Michal wrote, I'm for using sr_lock_datastore() when starting confirmed commit in netopeer2-server. To your note about persist and actually splitting commit and confirm between 2 NETCONF sessions, what about doing the lock on that netopeer's internal sysrepo session? I'm not sure (will wait for Michal for his view), because the change itself must be done on the NETCONF client's sysrepo session (because of NACM), while this locking (granting access only to netopeer2) would be done in the netopeer's internal session (to make it persistent) - I believe it is doable despite it is not straightforward. 
And it also solves the problem with releasing this lock when the NETCONF session is terminated. > To me this seems like a reasonable approach because the confirmed commit locks should not be released immediately after the issuing session terminates, the commit should be cancelled first (unless persistent, then the locks should actually remain in place). Nevertheless, standard NETCONF locks would need a bit of extra work because if a session locks a datastore and then starts a confirmed commit (or the other way around), both operations should succeed even though both require datastore locks. > > In Michal's last response, it was suggested to implicitly lock the running datastore at the beginning of every confirmed commit. We were afraid this might interfere with confirmed commits with persist-ids. In this situation, we would expect a second netopeer2 session to be allowed to perform commits on the running > > > > actually this is the question - the specification actually "strongly suggests" the use of (NETCONF) locking by (NETCONF) clients. With the approach proposed by Michal, the locking is actually forced. That is, changes during an ongoing confirmed commit are prohibited. The question is whether we are fine with such an implementation. I am fine with it, because otherwise it would really complicate the implementation and introduce the problems you have mentioned. What do other developers think? > > > datastore, regardless of which netopeer2 session started the confirmed commit. During a persistent confirmed commit, the original user may also disconnect, which would normally drop any sysrepo locks associated with the session. Also, based on the snippet of the NETCONF RFC below, we thought that any user could still issue a normal edit-config if the datastore is unlocked, but those changes have the potential to be removed or altered during a cancel commit.
We understood this to be a strong suggestion for the user of the system to place a lock before starting a confirmed commit, not as a requirement of the implementation to do it automatically. Please let us know if you feel differently. > > > > I think it is more a recommendation for the client performing those edit-config changes - for them the lock will not be granted because of the ongoing confirmed commit. And as you wrote, in case of NETCONF locking before a confirmed commit, when the session is closed (which can be expected when using confirmed commit, e.g. because of changing management interface configuration), the NETCONF lock will be automatically released. > > I would like to have the implementation without internal sysrepo locking and allowing edit-config while there is an ongoing confirmed commit. But I'm a little skeptical about such an ultimate solution in case you don't have full control of the data access. Yes, this is the reason for your initial idea to implement it inside sysrepo. Anyway, I would still prefer the simpler implementation in netopeer2, which does not cover all the cases / does not allow all the possibilities mentioned in the specification. It is stricter than the specification, but just to be safer (avoid inconsistent states), and I don't think the limitation is critical in this case. So, the confirmed commit would be implemented as an operation that takes some (specified) time, and during this time it is not possible to manipulate the datastore content. Is it fine for everyone? > Seems to me we are gaining much more than losing, so I support this. Even though it could be a bit limiting when, for example, a session starts a persist confirmed commit and then terminates. Unless some other session finishes/cancels the confirmed commit, no other session could modify until the timeout expires. But this may actually be a feature, not a bug. > >
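The stricter behaviour agreed on above (while a confirmed commit is pending, only its owner may commit again, confirm, or cancel) can be sketched as a small ownership check. This is only an illustration under stated assumptions: struct cc_state, cc_is_owner() and cc_commit_allowed() are hypothetical names, not netopeer2 or sysrepo API, and the real server would also have to take the sysrepo lock discussed in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of the single active confirmed commit. */
struct cc_state {
    bool active;            /* is a confirmed commit pending? */
    bool persist;           /* persistent, or tied to one session? */
    unsigned int owner_sid; /* owning NETCONF session (non-persistent case) */
    const char *persist_id; /* persist token (persistent case) */
};

/* True if the caller is entitled to extend, confirm or cancel the commit. */
bool cc_is_owner(const struct cc_state *cc, unsigned int sid,
                 const char *persist_id)
{
    assert(cc != NULL);
    if (!cc->active) {
        return false;
    }
    if (cc->persist) {
        /* Any session may take over, but only with the right token. */
        return persist_id != NULL && cc->persist_id != NULL
            && strcmp(cc->persist_id, persist_id) == 0;
    }
    return cc->owner_sid == sid;
}

/* True if a commit from this caller may proceed right now. */
bool cc_commit_allowed(const struct cc_state *cc, unsigned int sid,
                       const char *persist_id)
{
    assert(cc != NULL);
    return !cc->active || cc_is_owner(cc, sid, persist_id);
}
```

With a persistent commit, a different session presenting the matching persist-id passes the check, which is the case the thread wants to keep working after the original session disconnects.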
> > > > Snippet from RFC 6241 section 8.4.1: > > > > For shared configurations, this feature can cause other configuration changes (for example, via other NETCONF sessions) to be inadvertently altered or removed, unless the configuration locking feature is used (in other words, the lock is obtained before the operation is started). Therefore, it is strongly suggested that in order to use this feature with shared configuration datastores, configuration locking SHOULD also be used. > > > > We would still like to discuss further the use of the unmodified sr_commit to revert the data. In the unlikely case that a subscriber rejects the previous running datastore state, how should we expect netopeer2 to react? For confirmed commits with a large number of changes, it seems potentially dangerous that a single subscriber could block all changes. If the sr_commit was unsuccessful, how should netopeer2 behave? Right now, we only see two main options. > > > > * return rpc-error in confirmed-commit response, but hold on to the internal confirmed commit context so the user can try again > > > > * return rpc-error in confirmed-commit response, and throw away internal confirmed commit context (essentially confirming the commit) > > > > We would prefer the first option, but this could get tricky if sysrepo returns an error during a timeout or a kill/close-session. Do you have any thoughts on this? > > > > As you wrote - it is probably clear only for cancel-commit. That is also the only situation when you are able to send rpc-error. Keeping the internal context would mean that the confirmed-commit is still ongoing, so all the limitations on locks (and with internal sysrepo locking also on other operations) would be still active. Maybe keep the context and the confirmed commit ongoing only in the case of failure on cancel-commit. In other cases, just log the errors, remove the context and leave data as it is.
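The policy suggested in the reply above (keep the context only when an explicit cancel-commit fails, otherwise log and drop it) could look roughly like this. It is a sketch, not netopeer2 code: struct cc_ctx, enum cc_trigger and cc_finish_revert() are hypothetical names, and the revert result is passed in as a plain flag rather than coming from a real sysrepo call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* What caused the revert attempt (hypothetical enumeration). */
enum cc_trigger {
    CC_TRIGGER_CANCEL_RPC,  /* client sent cancel-commit */
    CC_TRIGGER_TIMEOUT,     /* confirm timeout expired */
    CC_TRIGGER_SESSION_END  /* owning session closed or was killed */
};

/* Minimal stand-in for the confirmed commit context. */
struct cc_ctx {
    bool active;
};

/*
 * Apply the policy after a revert attempt.
 * Returns true when an rpc-error should be sent back to the client.
 */
bool cc_finish_revert(struct cc_ctx *ctx, enum cc_trigger trigger,
                      bool revert_ok)
{
    assert(ctx != NULL);
    if (!revert_ok && trigger == CC_TRIGGER_CANCEL_RPC) {
        /* Only here is there a client to answer: keep the context so
         * the confirmed commit stays ongoing and can be retried. */
        return true;
    }
    if (!revert_ok) {
        /* Nobody to report to: log, then leave the data as it is. */
        fprintf(stderr, "confirmed commit revert failed, dropping context\n");
    }
    ctx->active = false;
    return false;
}
```

A failed revert on timeout or session end therefore ends the confirmed commit with the new data still in place, which is the trade-off the reply accepts.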
> > One note on reverting changes (and I hope that Michal will confirm this) - instead of getting and storing the complete datastore, I am thinking about a feature in sysrepo - snapshots. AFAIK, sysrepo translates complex datastore changes (copy-config) into simple transactions using lyd_diff(). So any change of the data is a list of transactions with the ability to revert back on sr_commit() failure. By storing the transactions not just between sr_commits(), but between 2 (somehow specified) versions of the datastore context, it would be possible to revert back to a specific snapshot more effectively than by storing a datastore copy. But that's just an idea - usable by the confirmed-commit, but not necessary for it. And maybe useful for any sysrepo application, not just the NETCONF server. It seems to me to be a more generic version of confirmed-commit for local use of sysrepo - from time to time it is possible to do a snapshot of the datastore for future revert > Yes, this seems possible and as long as no modifications would be allowed while there is an ongoing confirmed-commit, it should work fine. However, we need to decide how to handle netopeer2-server and (more importantly) sysrepod terminations (whether a standard one or a crash). If we want to be able to cancel a confirmed commit after a netopeer2-server/sysrepod crash, some information would need to be persistent. In case of sysrepo snapshots, those simple transactions would need a format for storing in a file. Also, the time the confirmed commit was issued, the confirmed commit timeout (so that it can be learnt when to timeout the commit), and perhaps also something else will need to be saved persistently. I personally think this feature is required, what do others think? Michal > Radek > > PS: one note regarding storing the datastore for future revert - I have created issue#1117 as a proposal for sysrepo enhancement. It's just an idea - usable, but not necessary for your work.
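To make the persistence point above concrete, here is a minimal sketch of saving the issue time and timeout so that a restarted server can work out how long the confirmed commit still has. Everything in it is an assumption for illustration: struct cc_record, the one-line text format, and the function names are hypothetical, and the sketch assumes a persist-id without whitespace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical on-disk record of a pending confirmed commit. */
struct cc_record {
    long long issued;     /* Unix time the confirmed commit was issued */
    unsigned int timeout; /* confirm timeout in seconds */
    char persist_id[64];  /* empty when the commit is not persistent */
};

/* Serialize the record into buf; returns 0 on success, -1 if it does not fit. */
int cc_record_format(const struct cc_record *r, char *buf, size_t len)
{
    assert(r != NULL && buf != NULL);
    int n = snprintf(buf, len, "issued=%lld timeout=%u persist-id=%s\n",
                     r->issued, r->timeout, r->persist_id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Parse a record written by cc_record_format(); returns 0 on success. */
int cc_record_parse(const char *buf, struct cc_record *r)
{
    assert(buf != NULL && r != NULL);
    r->persist_id[0] = '\0'; /* stays empty if no token was stored */
    int n = sscanf(buf, "issued=%lld timeout=%u persist-id=%63s",
                   &r->issued, &r->timeout, r->persist_id);
    return n >= 2 ? 0 : -1;
}

/* Seconds left before the commit must be reverted; 0 means revert now. */
long long cc_seconds_left(const struct cc_record *r, long long now)
{
    assert(r != NULL);
    long long left = r->issued + (long long)r->timeout - now;
    return left > 0 ? left : 0;
}
```

On startup, a server following this idea would parse the stored record and either re-arm its timer with cc_seconds_left() or revert immediately when the result is 0.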
> > > > Thanks, > > > > Adam Weast > > > > This message has been classified *Public* by *ADAM WEAST* on Tuesday, May 08, 2018 at 12:59:14 PM. > > > > _______________________________________________ > > sysrepo-devel mailing list > > sysrepo-devel at sysrepo.org > > http://lists.sysrepo.org/listinfo/sysrepo-devel > > -- Radek Krejci > mobile : +420 732 212 714 > office : +420 234 680 256 > e-mail : rkrejci at cesnet.cz > LinkedIn: http://www.linkedin.com/in/radekkrejci > > CESNET, Association of Legal Entities > Zikova 4 > 160 00 Praha 6 > Czech Republic

From mvasko at cesnet.cz Wed May 23 13:35:33 2018
From: mvasko at cesnet.cz (=?utf-8?q?Michal_Va=C5=A1ko?=)
Date: Wed, 23 May 2018 15:35:33 +0200
Subject: [sysrepo-devel] sysrepo optimizations
Message-ID: <746a-5b056e00-2d-3a3efa80@11285140>

Hi everyone, we have finished some libyang data tree optimizations [1] primarily meant to enhance sysrepo performance. So, it should now be possible to work (edit/get) with quite large data trees in reasonable time. I have tested on a data tree with 40 000 list instances and over 700 000 individual data nodes. took ~16s and some smaller ~25s. I am writing to ask anyone who is able to test the current libyang and sysrepo devel state to do so and provide feedback, so that we can make a stable and tested release some time next week. Thank you. Regards, Michal [1] https://github.com/CESNET/libyang/issues/508