[cf-dev] SSH access to CF app instances on Diego

Eric Malm
Dear CF community,

I'm pleased to announce that the Diego team is nearing completion of our initial track of work to enable SSH access to app instances running on Diego. We've recently published some preliminary versions of the Diego-SSH CLI plugin at https://github.com/cloudfoundry-incubator/diego-ssh/releases, and I've posted some instructions for developers and operators in the Diego Design Notes, at https://github.com/cloudfoundry-incubator/diego-design-notes/blob/master/ssh-access-and-policy.md.

For a CF+Diego deployment with SSH already enabled and publicly routable, accessing a CF app instance over SSH is as simple as running cf ssh <app-name> after installing the CLI plugin. The plugin also supports a -i option to target other instance indices and a -L option to forward local ports, and we expect to add a cf scp command soon as well for bidirectional file transfer.

It's also possible to use the built-in ssh and scp clients on Linux and OS X to access instances. For more details on how to use those clients, please see the design notes linked above or the more extensive documentation for the CF and Diego authenticators on the Diego-SSH repository: https://github.com/cloudfoundry-incubator/diego-ssh.

If you're using the steps in the diego-release README to deploy it and CF to BOSH-Lite, this SSH functionality should all just work for you automatically. If you're operating on a different infrastructure, you will have a few values to configure in your BOSH deployment manifests, and, depending on your environment, a load balancer to provision in front of Diego's SSH proxies. Again, the design notes cover the basic manifest properties and infrastructure setup required to give access to CF app instances.

Version 0.1.2 of the Diego-SSH plugin also provides commands for adjusting the SSH policy on CF apps and spaces through Cloud Controller. In particular, users can employ those commands to enable and disable SSH both on an individual app and on an entire space. Operators can separately choose whether to allow SSH access via a BOSH-configurable parameter in Cloud Controller's configuration.
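
As a rough illustration of how these layers of policy might combine, here is a minimal Python sketch. It is not Cloud Controller code: the class, field, and function names are hypothetical, and the rule that all three levels must permit access is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SshSettings:
    """Hypothetical container for the three switches described above."""
    operator_allows_ssh: bool  # BOSH-configured Cloud Controller setting
    space_allows_ssh: bool     # per-space policy
    app_enables_ssh: bool      # per-app policy


def ssh_denial_reason(settings: SshSettings) -> Optional[str]:
    """Return why SSH would be denied, or None if it is permitted.

    Assumption: the operator, the space, and the app must all permit access.
    """
    if not settings.operator_allows_ssh:
        return "disabled by operator"
    if not settings.space_allows_ssh:
        return "disabled for space"
    if not settings.app_enables_ssh:
        return "disabled for app"
    return None
```

Under this assumed model, a developer who enables SSH on an app still has no access while the space or the operator disallows it.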

Finally, we realize that unfettered SSH access to instances is something of a double-edged sword: while it enables interactive inspection and experimentation inside an app instance, it also makes it possible to build 'snowflake' instances that will melt whenever the platform restarts them. To prevent the creation of such snowflakes, we propose to implement a restart policy for CF app instances: after executing a command, concluding an interactive session, or copying a file into an instance, that instance will be restarted. Forwarding ports and copying files out will not trigger a restart, though. Cloud Controller admins will be able to opt individual spaces out of this restart behavior, to allow developers greater inspection of and experimentation on their apps in those spaces. We haven't yet introduced this behavior, but it's among the last pieces of work we intend to do before we consider this initial batch of SSH work complete.
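
To make the proposed rule concrete, here is a minimal Python sketch of the decision. It only restates the proposal above and is not the Diego implementation; the enum, its members, and the `space_opted_out` parameter are hypothetical names chosen for the example.

```python
from enum import Enum


class SshAction(Enum):
    """Kinds of SSH activity named in the proposal (hypothetical names)."""
    EXEC_COMMAND = "exec_command"
    INTERACTIVE_SESSION = "interactive_session"
    COPY_FILE_IN = "copy_file_in"
    COPY_FILE_OUT = "copy_file_out"
    FORWARD_PORT = "forward_port"


# Actions that could modify the instance and so would trigger a restart.
TAINTING_ACTIONS = frozenset({
    SshAction.EXEC_COMMAND,
    SshAction.INTERACTIVE_SESSION,
    SshAction.COPY_FILE_IN,
})


def should_restart(action: SshAction, space_opted_out: bool = False) -> bool:
    """Decide whether an instance is restarted once the action completes."""
    if space_opted_out:
        # An admin has exempted this space from the restart behavior.
        return False
    return action in TAINTING_ACTIONS
```

In this sketch, forwarding a port or copying a file out never restarts the instance, and an opted-out space is never restarted regardless of the action.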

If you're interested in the progress of the remaining stories we currently have planned for SSH access, please look for them in the Diego tracker with the 'ssh' label.


Thanks,
Eric Malm, CF Runtime Diego PM

_______________________________________________
cf-dev mailing list
[hidden email]
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev

Re: [cf-dev] SSH access to CF app instances on Diego

Jan Dubois
On Thu, Jun 25, 2015 at 5:36 PM, Eric Malm <[hidden email]> wrote:
> after executing a command, concluding an
> interactive session, or copying a file into an instance, that instance will
> be restarted.

What is the purpose of being able to copy a file into an instance if
the instance is restarted as soon as the file has been received?

Cheers,
-Jan

Re: [cf-dev] SSH access to CF app instances on Diego

James Bayer
you can turn the "restart tainted containers" feature off with configuration if you are authorized to do so. then files written into a container with scp would persist for the lifetime of the container, even after the ssh session ends.

--
Thank you,

James Bayer


Re: [cf-dev] SSH access to CF app instances on Diego

Matthew Sykes
My concern is the default behavior.

When I first prototyped this support in February, I never expected that merely accessing a container would cause it to be terminated. As we can see from Jan's response, it's completely unexpected; many others have the same reaction.

I do not believe that this behavior should be part of the default configuration, and I do believe the control needs to be at the space level. I have already expressed this opinion during Diego retros and at the runtime PMC meeting.

I honestly believe that if we were talking about applying this behavior to `bosh ssh` and `bosh scp`, few would even consider running in a 'kill on taint' mode because of how useful that access is. We should learn from that.

If this behavior becomes the default, I think our platform will be seen as moving from opinionated to parochial. That would be unfortunate.


--
Matthew Sykes
[hidden email]


Re: [cf-dev] SSH access to CF app instances on Diego

James Bayer
thanks for sharing your view, matt. i happen to disagree. i've talked to many of the more conservative enterprise operations people, and they really don't want to enable snowflakes by default.

since we plan to have both global and per-space/per-app configuration options, administrators can make the choice for each installation if they want to enable writable containers without tainted container recycling. each vendor that distributes cloud foundry could have their own opinionated default.

i'd love to hear from others in the oss community what the default should be in cf-release. i can tell you that the vast majority of customers and people i've spoken with feel it should be "recycle tainted containers by default" and only able to be turned off with an exception to the rule from an administrator.

--
Thank you,

James Bayer


Re: [cf-dev] SSH access to CF app instances on Diego

Benjamin Black
In reply to this post by Matthew Sykes
matt,

could you elaborate a bit on what you believe ssh access to instances is for? 


b



Re: [cf-dev] SSH access to CF app instances on Diego

aaron_huber
Administrator
In reply to this post by James Bayer
I can certainly confirm that for Intel this restart feature would be required for us to use ssh/scp access at all, and maybe not even then. We've been selling Cloud Foundry to our security folks as a huge improvement in app security specifically because developers don't have access to the containers and they don't need to be "system admins". Enabling ssh access goes a long way towards unwinding that, and having this extra control might give us some wiggle room to enable it. In any case, the default configuration should always be the most secure, and it can easily be configured off if desired.

I think it's been said before, but I also strongly feel that "cf files" needs to continue to work in Diego even if ssh/scp access is disabled.  If we're not allowed to enable ssh/scp access and "cf files" goes away then we've effectively lost all access to the containers for developers which will be frustrating.  If "cf files" is going away, then we'd need some way to enforce that the only access allowed is read-only.  Even destroying the container after the ssh session ends may not be good enough - an argument can easily be made that a malicious user could keep the ssh session open after intentionally modifying the container in some way.

Aaron Huber
Intel Corporation

Re: [cf-dev] SSH access to CF app instances on Diego

Matt Cowger
In reply to this post by Benjamin Black
>we propose to implement a restart policy for CF app instances: after executing a command, concluding an interactive session, or copying a file into an instance, that instance will be restarted.

I have to agree with Matt S here: having this as the default behavior is rough. Having an instance restart automatically after running a command, after ending a session (what if that session ended accidentally, or because of external network problems?), or upon copying a file in (which presumably was done for a reason) would certainly fall into the 'unexpected' category for someone who doesn't follow CF development closely.

I totally understand the argument about tainted containers being snowflakes (hugely dangerous in and of itself), and I wouldn't want to see them stick around forever either.

Some alternative thoughts:

* Upon tainting a container, add a scheduled task that recycles the container in N hours unless some action is taken (like issuing another tainting command)
* Display a warning (MOTD style) on login that the following sorts of commands will result in an instant recycle upon logout
* Combined with the above, automatically recycle a container after N hours or on logout unless a given file (~/dont_tase_me) exists*

I think something like the above prevents the 'magic' effects that feel dangerous and, as Matt suggested, somewhat parochial.  They also require the instance manager to make active efforts to prevent recycling, thus hopefully preventing some of the self-induced snowflake effect.

*reminds me of my favorite old VMS command: FORCE_DISMOUNT_USE_WITH_EXTREME_CAUTION (yes, that was the whole command you actually had to type).
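
To make the timer idea in the list above concrete, here is a minimal Python sketch. It only illustrates the suggestion and is not anything Diego implements; the function names, the `grace` parameter (the "N hours"), and the reduction of the sentinel file check to a boolean are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional


def recycle_deadline(
    last_tainted_at: Optional[datetime],
    grace: timedelta,
    sentinel_exists: bool,
) -> Optional[datetime]:
    """Return when a container should be recycled, or None if it should not be.

    Hypothetical model: each tainting command resets the timer, and the
    presence of a sentinel file (e.g. ~/dont_tase_me) suppresses recycling.
    """
    if last_tainted_at is None:
        return None  # never tainted, nothing to recycle
    if sentinel_exists:
        return None  # someone explicitly asked to keep the container
    return last_tainted_at + grace


def is_due(deadline: Optional[datetime], now: datetime) -> bool:
    """True once the recycle deadline has passed."""
    return deadline is not None and now >= deadline
```

Because the deadline is computed from the most recent tainting command, issuing another such command simply pushes the recycle time back by the grace period.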

--Matt Cowger



Re: [cf-dev] SSH access to CF app instances on Diego

Dan Wendorf
It feels like the right behavior, but also very unexpected. My vote would be to enable it by default, but as Matt suggests, make clear to the user that their actions will have unrequested consequences. Users will be trying to use SSH with expected patterns, including the assumption of longevity.

I don't think it's a tough sell that recycling is a good idea, but that sale still needs to be made to each user.


Re: [cf-dev] SSH access to CF app instances on Diego

Cornelia Davis
On behalf of many customers that I have spoken to, the default behavior of disposing of any container that could be tainted is the right choice. Not providing rope is a huge feature of the platform. If you want to be dangerous, it should be hard to do so. And as James has explained, it is possible.


Re: [cf-dev] SSH access to CF app instances on Diego

Jan Dubois
In reply to this post by James Bayer
I would rather have scp throw an error than have it copy the
file and then immediately kill the instance. I still don't understand
how that would ever be useful.

Or would having a simultaneous ssh session to the same container keep
it alive even after the scp session ended?

I have no problem with not allowing scp into app instances by default.

Cheers,
-Jan

On Thu, Jun 25, 2015 at 6:05 PM, James Bayer <[hidden email]> wrote:

> you can turn the "restart tainted containers" feature off with configuration
> if you are authorized to do so. then using scp to write files into a
> container would be persisted for the lifetime of the container even after
> the ssh session ends.
>
> On Thu, Jun 25, 2015 at 5:50 PM, Jan Dubois <[hidden email]> wrote:
>>
>> On Thu, Jun 25, 2015 at 5:36 PM, Eric Malm <[hidden email]> wrote:
>> > after executing a command, concluding an
>> > interactive session, or copying a file into an instance, that instance
>> > will
>> > be restarted.
>>
>> What is the purpose of being able to copy a file into an instance if
>> the instance is restarted as soon as the file has been received?
>>
>> Cheers,
>> -Jan

Re: [cf-dev] SSH access to CF app instances on Diego

Matthew Sykes
In reply to this post by Benjamin Black
Depends on your role and where your app is in the deployment pipeline. Most of the scenarios I envisioned were for the tail end of development where you need to poke around to debug and figure out those last few problems.

For example, Ryan Morgan was saying that the Cloud Foundry plugin for Eclipse is going to use the ssh support in Diego to enable debugging of application instances in the context of a buildpack-deployed app. This is aligned with other requirements I've heard from people working on dev tools.

As apps reach production, I would hope that interactive ssh is disabled entirely on the prod space leaving only scp in source mode as an option (something the proxy can do).

Between dev and prod, there's a spectrum, but in general, I either expect access to be enabled or disabled - not enabled with a suicidal tendency.

On Thu, Jun 25, 2015 at 10:53 PM, Benjamin Black <[hidden email]> wrote:
matt,

could you elaborate a bit on what you believe ssh access to instances is for? 


b


On Thu, Jun 25, 2015 at 9:29 PM, Matthew Sykes <[hidden email]> wrote:
My concern is the default behavior.

When I first prototyped this support in February, I never expected that merely accessing a container would cause it to be terminated. As we can see from Jan's response, it's completely unexpected; many others have the same reaction.

I do not believe that this behavior should be part of the default configuration and I do believe the control needs to be at the space level. I have already expressed this opinion during Diego retros and at the runtime PMC meeting.

I honestly believe that if we were talking about applying this behavior to `bosh ssh` and `bosh scp`, few would even consider running in a 'kill on taint mode' because of how useful it is. We should learn from that.

If this behavior becomes the default, I think our platform will be seen as moving from opinionated to parochial. That would be unfortunate.



Re: [cf-dev] SSH access to CF app instances on Diego

Dieu Cao
I think the CLI could add a clarifying message, when ssh is used, stating what the current recycling policy is.
Eric, what do you think about calling it the "recycling" policy, enabled by default? =D

-Dieu



Re: [cf-dev] SSH access to CF app instances on Diego

John Wong
> after executing a command, concluding an interactive session, or copying a file into an instance, that instance will be restarted.

How does it monitor the behavior? Is there a whitelist of commands? I am curious because I am trying to find out what the whitelist contains. Also, does the restart happen at the end of the bosh ssh APP_NAME session? What if two users are connected simultaneously?

Thanks.


Re: [cf-dev] SSH access to CF app instances on Diego

James Myers
I have to agree with Matt on this one. I feel that the recycling of containers is a very anti-developer default. When you approach Cloud Foundry from the perspective of running production applications, the recycle policy makes complete sense. However, I feel that this misses out on one of the massive benefits/use cases of Cloud Foundry: what it offers to the development process.

From a security standpoint, if you can ssh into a container, it means you have write access to the application in Cloud Foundry. Thus you can already push new bits/change the application in question. All of the "papertrail" functionality around pushing/changing applications exists for SSH as well (we record events, output log lines, make it visible to users that action was taken on the application), and thus concerned operators would be able to determine whether someone has modified the application in question.

Therefore I'm lost on how this is truly the most secure default. If we are really going by the idea that all defaults should be the most secure, ssh should be disabled by default.

As a developer, I can see many times in which I would want to be able to ssh into my container and change my application as part of a troubleshooting process. Using BOSH as an example, CF Devs constantly ssh into VMs and change the processes running on them in order to facilitate development. BOSH does not reap the VM and redeploy a new instance when you have closed the SSH session. Once again this is largely due to the fact that if you have SSH access, you can already perform the necessary actions to change the application through different means.

Another huge hindrance to development is that the recycling policy is controlled by administrators. It is not something that normal users can control, even though we give the end user the granularity of enabling/disabling SSH completely. This seems counterintuitive.

I feel that a better solution would be to provide the user with some knowledge of which instances may be tainted, and then allow them to opt into a policy which will reap tainted containers. This provides users with clear insight that their application instance may be a snowflake (and that they may want to take action), while also allowing normal behavior with regards to SSH access to containers.
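
To make that proposal concrete, here is a minimal sketch of taint tracking with opt-in reaping. Every name in it (Instance, App, reap_tainted, and so on) is hypothetical and is not part of Diego or the Cloud Controller. It also assumes that recycling waits until the last open session on an instance closes, which the thread has not settled.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """One app instance (hypothetical model, not a Diego type)."""
    index: int
    tainted: bool = False      # set once any SSH/scp session has touched it
    open_sessions: int = 0     # sessions currently attached


@dataclass
class App:
    name: str
    reap_tainted: bool = False  # opt-in policy; off by default in this sketch
    instances: list = field(default_factory=list)

    def session_opened(self, index):
        # Assumes the list position equals the instance index.
        inst = self.instances[index]
        inst.open_sessions += 1
        inst.tainted = True

    def session_closed(self, index):
        """Return True if the instance should be recycled now."""
        inst = self.instances[index]
        inst.open_sessions -= 1
        # Assumption: only recycle once the last session has ended.
        return self.reap_tainted and inst.open_sessions == 0

    def tainted_indices(self):
        """What a user-facing listing of tainted instances could report."""
        return [inst.index for inst in self.instances if inst.tainted]
```

With reap_tainted left at False, an instance is only reported as tainted and keeps running; a user who wants the stricter behavior flips the flag for their own app.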

To summarize, by enabling the recycling policy by default we not only produce extremely unusual behavior / workflows for developers, we are also minimizing the developer-friendliness of CF in general. This mixed with the fact that as a user I cannot even control this policy, leads me to believe that as a default recycling should be turned off as it provides the most cohesive and friendly user experience.


Re: [cf-dev] SSH access to CF app instances on Diego

gberche
Hi,

please find my feedback to this thread

short version:
1- need to preserve a good CF experience over HTTP only (direct SSH flows are still blocked, and a pain, in many organisations) => +1 to preserve "cf files" or fine-tune the Diego plugin so that ssh over HTTP works out of the box
2- default "recycle tainted containers by default" policy seems good to me
3- needs to be completed with more control of the recycling policy (UX such as "quarantine" or GAE "lock/unlock")
4- development use-cases need to be better supported (dev/prod parity); not sure ssh/scp is the right path though

long version:

1- cf files and ssh over HTTP

As previously mentioned in [1], by exposing its APIs over HTTP, CF did a great job of being easy to consume through the HTTP proxies that some companies still use, making the CF experience seamless when consuming a public PaaS, or a private PaaS shared among corporate entities. It seems important to me to preserve a good CF experience over HTTP only.

If interactive SSH access, scp and port forwarding become the mainstream way to operate and troubleshoot apps (supporting "cf files", and replacing the previous DEBUG and CONSOLE ports), it will be useful for users behind such firewalls to be able to configure the Diego ssh plugin to use HTTP/SOCKS proxies to reach public CF instances. As the Diego ssh CLI plugin supports using the regular local ssh binaries, this could potentially be done by tweaking the .ssh config file to add options for the host ssh.${domain} so that connections go through proxies (possibly double tunnels as described in [2]). However, for new users in such a network context, especially on Windows, the setup work required before using a public CF instance starts to add up.
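
As a rough illustration of that .ssh config tweak, the helper below renders an OpenSSH client stanza that tunnels through an HTTP proxy. The helper name, the proxy host and the default port 2222 are assumptions made for the example, not values taken from this thread, and the ProxyCommand relies on an OpenBSD-style nc being installed locally.

```python
def render_ssh_config(domain, proxy_host, proxy_port, ssh_port=2222):
    """Render an OpenSSH client config stanza for ssh.<domain>.

    Assumptions (not from the thread): the SSH proxy listens on
    ssh.<domain> at ssh_port, and an OpenBSD-style `nc` is available.
    """
    lines = [
        f"Host ssh.{domain}",
        f"    Port {ssh_port}",
        # nc -X connect asks the HTTP proxy for a CONNECT tunnel to %h:%p
        f"    ProxyCommand nc -X connect -x {proxy_host}:{proxy_port} %h %p",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values; paste the output into ~/.ssh/config.
    print(render_ssh_config("example.com", "proxy.corp.local", 8080))
```

This only covers the single-proxy case; the double-tunnel setup mentioned above would need a further hop that is not sketched here.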

2- default "recycle tainted containers by default" seems good to me

Given that apps deployed on CF comply with the 12-factor app principles, their instances may be restarted at any time (e.g. during the deployment of a new CF release or a stemcell upgrade). So the "recycle tainted containers by default" policy is not a surprise.

3- need to be completed with more control of the recycling policy (UX such as "quarantine" or GAE "lock/unlock" )

There are some specific use-cases where the "recycle tainted containers by default" policy would be problematic when running applications in production:

An application instance is malfunctioning (e.g. hanging) and interactive debugging is necessary. The app-ops ssh into the container and start taking some diagnostic steps (e.g. sending kill -SIGQUIT signals to take thread dumps, or locally changing log levels).

If the ssh connection ever breaks or times out, the "recycle tainted containers by default" policy would restart the instance, preventing the current diagnostic from completing.

Another similar use case: a production application is suspected to be compromised by an attacker. App-ops need to capture evidence and better understand how the abuse was done. There isn't enough information in the streamed logs, and there is a need to get into the container to inspect the ephemeral FS, the processes and the memory. This may require more than one simultaneous SSH connection, and may span multiple hours.

In both use-cases above, while the application is 12 factor compliant and the "recycle tainted containers by default" policy would be opted in on the corresponding space, there would be a need to transiently turn the mode off.

In terms of user experience, this may appear as an explicit user request to "quarantine" the tainted app instances (or the whole app) so that CF does not attempt to restart them. Or it may appear as something like the Google App Engine "lock/unlock":

A call to a new "unlock" command on a CF app instance would be necessary to get SSH access to it. CF then considers this instance "tainted"/untrusted, as it may have deviated from the pushed content, and no longer acts on it (i.e. it does not monitor its bound $PORT or root process exit, which may be handy for diagnosing it as desired). When the "lock" command is requested on this instance, CF destroys the tainted instance and recreates a fresh "trusted" one.
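
The lock/unlock flow above can be sketched as a small state machine. This is a hypothetical model only: neither the class nor its methods exist in CF, and "recreating" the instance is reduced to bumping a generation counter.

```python
from enum import Enum


class State(Enum):
    TRUSTED = "trusted"    # fresh instance, monitored, SSH refused
    UNLOCKED = "unlocked"  # SSH allowed, treated as tainted, not monitored


class InstanceLock:
    """Hypothetical model of one app instance under a lock/unlock policy."""

    def __init__(self):
        self.state = State.TRUSTED
        self.generation = 0  # bumped each time the instance is recreated

    @property
    def monitored(self):
        # The platform only health-checks instances it still trusts.
        return self.state is State.TRUSTED

    def unlock(self):
        self.state = State.UNLOCKED

    def ssh(self):
        if self.state is not State.UNLOCKED:
            raise PermissionError("instance is locked; unlock it first")
        return f"session on generation {self.generation}"

    def lock(self):
        if self.state is State.UNLOCKED:
            # Destroy the tainted instance and create a fresh one.
            self.generation += 1
        self.state = State.TRUSTED
```

In this model a dropped SSH connection changes nothing: the instance stays unlocked, and is only recycled when someone explicitly calls lock().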

4- development use-cases need to be better supported (dev/prod parity) not sure ssh/scp is the right path though

I agree with James Myers that development use-cases should be better supported.

First, CF should strive to support dev-prod parity [4]. However, there is currently no longer a version of CF that a developer can run on a laptop (e.g. when doing offline development during a commute) that would behave like prod and embed buildpacks. There used to be "CF on a single VM". Heroku and GAE have emulators. Cloud rocker [5] is close, but it still takes 10s or more for changes made to the app to be reflected in a running app.

There are some legitimate use cases during development for modifying the sources of the application and having those changes take effect immediately. Many app development frameworks support such development modes (even those that promote test-driven practices), and getting fast feedback is important. Having dev-prod parity means supporting these use cases while preserving prod behavior (having VCAP_SERVICES and VCAP_APPLICATION set and the buildpack processing applied on the same stack (cflinuxfs2)). Being able to run offline would be even better.

I however believe that providing SSH/SCP access to change the file system of a running app instance may not be the appropriate response, given that the FS and the app instance are still ephemeral. Who would want to modify files that could be lost at any time (e.g. during a stemcell upgrade)?

I'd rather see value in further exploring the ideas laid out by James Bayer in [5], e.g. in the form of a git repo populated with the /home/vcap/app subdir, which developers could clone and push to, and have the instance's ephemeral FS updated with the pushed changes.

This may be combined with a cloudrocker mechanism so as to work in a fully offline mode when this is required.
[5] https://docs.google.com/document/d/1_C3OWS6giWx4JL_IL9YLA6jcppyQLVD-YjR0GeA8Z0s/edit#heading=h.toypuu5pxh65


On Thu, Jul 2, 2015 at 10:18 PM, James Myers <[hidden email]> wrote:
I have to agree with Matt on this one. I feel that the recycling of containers is a very anti-developer default. When you approach Cloud Foundry from the perspective of running production applications the recycle policy makes complete sense. However, I feel that this misses out on one of the massive benefits/use cases of Cloud Foundry, what it offers to the development process.

From a security stand point, if you can ssh into a container, it means you have write access to the application in CloudFoundry. Thus you can already push new bits/change the application in question. All of the "papertrail" functionality around pushing/changing applications exists for SSH as well (we record events, output log lines, make it visible to users that action was taken on the application), and thus concerned operators would be able to determine if someone modifying the application in question.

Therefore I'm lost on how this is truly the most secure default. If we are really going by the idea that all defaults should be the most secure, ssh should be disabled by default.

As a developer, I can see many times in which I would want to be able to ssh into my container and change my application as part of a troubleshooting process. Using BOSH as an example, CF Devs constantly ssh into VMs and change the processes running on them in order to facilitate development. BOSH does not reap the VM and redeploy a new instance when you have closed the SSH session. Once again this is largely due to the fact that if you have SSH access, you can already perform the necessary actions to change the application through different means.

Another huge hindrance to development, is that the recycling policy is controlled by administrators. It is not something that normal users can control, even though we allow the granularity of enabling/disabling SSH completely to the end user. This seems counterintuitive.

I feel that a better solution would be to provide the user with some knowledge of which instances may be tainted, and then allowing them to opt into a policy which will reap tainted containers. This provides users with clear insight that their application instance may be a snowflake (and that they may want to take action), while also allowing normal behavior with regards to SSH access to containers.

To summarize, by enabling the recycling policy by default we not only produce extremely unusual behavior / workflows for developers, we are also minimizing the developer-friendliness of CF in general. This mixed with the fact that as a user I cannot even control this policy, leads me to believe that as a default recycling should be turned off as it provides the most cohesive and friendly user experience.

On Mon, Jun 29, 2015 at 9:14 AM, John Wong <[hidden email]> wrote:
> after executing a command, concluding an interactive session, or copying a file into an instance, that instance will be restarted.

How does it monitor the behavior? Is there a list of commands whitelisted? I am curious because I am trying to find out what the whitelist contain. Also is it at the end of the bosh ssh APP_NAME session? What if two users are there simultaneously?

Thanks.

On Mon, Jun 29, 2015 at 5:49 AM, Dieu Cao <[hidden email]> wrote:
I think with the CLI we could add clarifying messaging when using ssh what the current policy around recycling is.
Eric, what do you think about calling it the "recycling" policy, enabled by default? =D

-Dieu


On Sat, Jun 27, 2015 at 3:42 AM, Matthew Sykes <[hidden email]> wrote:
Depends on your role and where your app is in the deployment pipeline. Most of the scenarios I envisioned were for the tail end of development where you need to poke around to debug and figure out those last few problems.

For example, Ryan Morgan was saying that the Cloud Foundry plugin for eclipse is going to be using the ssh support in diego to enable debug of application instances in the context of a buildpack deployed app. This is aligned with other requirements I've heard from people working on dev tools.

As apps reach production, I would hope that interactive ssh is disabled entirely on the prod space leaving only scp in source mode as an option (something the proxy can do).

Between dev and prod, there's a spectrum, but in general, I either expect access to be enabled or disabled - not enabled with a suicidal tendency.

On Thu, Jun 25, 2015 at 10:53 PM, Benjamin Black <[hidden email]> wrote:
matt,

could you elaborate a bit on what you believe ssh access to instances is for? 


b


On Thu, Jun 25, 2015 at 9:29 PM, Matthew Sykes <[hidden email]> wrote:
My concern is the default behavior.

When I first prototyped this support in February, I never expected that merely accessing a container would cause it to be terminated. As we can see from Jan's response, it's completely unexpected; many others have the same reaction.

I do not believe that this behavior should be part of the default configuration and I do believe the control needs to be at the space level. I have have already expressed this opinion during Diego retros and at the runtime PMC meeting.

I honestly believe that if we were talking about applying this behavior to `bosh ssh` and `bosh scp`, few would even consider running in a 'kill on taint mode' because of how useful it is. We should learn from that.

If this behavior becomes the default, I think our platform will be seen as moving from opinionated to parochial. That would be unfortunate.


On Thu, Jun 25, 2015 at 6:05 PM, James Bayer <[hidden email]> wrote:
you can turn the "restart tainted containers" feature off with configuration if you are authorized to do so. then using scp to write files into a container would be persisted for the lifetime of the container even after the ssh session ends.

On Thu, Jun 25, 2015 at 5:50 PM, Jan Dubois <[hidden email]> wrote:
On Thu, Jun 25, 2015 at 5:36 PM, Eric Malm <[hidden email]> wrote:
> after executing a command, concluding an
> interactive session, or copying a file into an instance, that instance will
> be restarted.

What is the purpose of being able to copy a file into an instance if
the instance is restarted as soon as the file has been received?

Cheers,
-Jan
_______________________________________________
cf-dev mailing list
[hidden email]
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev



--
Thank you,

James Bayer

--
Matthew Sykes
[hidden email]


Re: [cf-dev] SSH access to CF app instances on Diego

gberche
In reply to this post by James Myers
Following up on James's description of the "papertrail" ssh audit traces that the diego-ssh support is adding.

These traces are very useful to have. Can you confirm that they are provided through loggregator (and don't appear in the cc events)? However, I'm wondering how reliable loggregator-based logs can be, as loggregator is lossy and not designed to support reliable transport of logs. While I understand there have been recent efforts to reduce loggregator's loss rate, I'm wondering how easy it would be for a CF user to cover their tracks (i.e. their "diego ssh" log entries), e.g. by simply flooding loggregator with user traffic (having RTR and diego compete for throughput into loggregator for a given app).
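
To make the concern concrete, here is a minimal sketch in Python. It is not loggregator code: it assumes a fixed-size, drop-oldest buffer as a stand-in for a lossy log pipeline, and the function name, buffer size and log line contents are all hypothetical.

```python
from collections import deque


def surviving_audit_lines(buffer_size, flood_lines):
    """Return the SSH audit lines still buffered after a burst of app log lines."""
    # Assumption: a bounded buffer that silently drops its oldest entry when full.
    buffer = deque(maxlen=buffer_size)
    # One audit line is emitted when the SSH session starts (hypothetical content).
    buffer.append(("SSH", "ssh session opened on instance 0"))
    # The user then floods the same buffer with ordinary application output.
    for i in range(flood_lines):
        buffer.append(("APP", f"noise {i}"))
    return [message for source, message in buffer if source == "SSH"]


print(surviving_audit_lines(buffer_size=100, flood_lines=50))
print(surviving_audit_lines(buffer_size=100, flood_lines=500))
```

Under these assumptions the audit line survives a small burst but is evicted once the flood exceeds the buffer capacity, which is why a reliable channel (such as the cc events) matters for audit purposes.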

Thanks,

Guillaume.

On Thu, Jul 2, 2015 at 10:18 PM, James Myers <[hidden email]> wrote:

From a security standpoint, if you can ssh into a container, it means you have write access to the application in CloudFoundry. Thus you can already push new bits/change the application in question. All of the "papertrail" functionality around pushing/changing applications exists for SSH as well (we record events, output log lines, make it visible to users that action was taken on the application), and thus concerned operators would be able to determine whether someone is modifying the application in question.




Re: [cf-dev] SSH access to CF app instances on Diego

gberche
In reply to this post by gberche
Eric,

The CAB minutes [1] mentioned you were still looking for feedback from the community on the policy for altered instances, but this thread seems to have been silent for a while.

Not sure whether you had seen my email and suggestion below for a way to quarantine the altered instances (beyond the per-space restart policy configuration). Such a quarantine request might be a good place to include an option to ask for the quarantined instances to be excluded from gorouter traffic.

Regards,

Guillaume.

On Fri, Jul 3, 2015 at 3:56 PM, Guillaume Berche <[hidden email]> wrote:
Hi,

Please find my feedback on this thread below.

short version:
1- need to preserve a good CF experience over HTTP only (direct SSH flow is still blocked and a pain in many organisations) => +1 to preserve "cf files" or fine-tune the diego plugin so that ssh over HTTP works out of the box
2- default "recycle tainted containers by default" policy seems good to me
3- needs to be completed with more control over the recycling policy (UX such as "quarantine" or GAE "lock/unlock")
4- development use cases need to be better supported (dev/prod parity); not sure ssh/scp is the right path though

long version:

1- cf files and ssh over HTTP

As previously mentioned in [1], by exposing its APIs over HTTP, CF did a great job of being easily consumed through the HTTP proxies that some companies still use, making the CF experience seamless when consuming a public PaaS, or a private PaaS shared among corporate entities. It seems important to me to preserve a good CF experience over HTTP only.

If interactive SSH access, scp and port forwarding become the mainstream solution to operate and troubleshoot apps (supporting "cf files" and replacing the previous DEBUG and CONSOLE ports), it will be useful for users behind such firewalls to be able to configure the diego ssh plugin to use HTTP/SOCKS proxies to reach public CF instances. As the diego ssh CLI plugin supports using the regular local ssh binaries, this could potentially be done by tweaking the .ssh config file to add options for host ssh.${domain} so that connections go through proxies (possibly double tunnels, as described in [2]). However, for new users in such a network context, especially on the Windows operating system, doesn't the setup work required before using a public CF instance start to add up?
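
As an illustration of the .ssh config tweak described above, here is a small Python sketch that renders such a stanza. It is a sketch under stated assumptions, not a tested recipe: it assumes the SSH proxy is reachable at ssh.<domain> on port 2222, that the local `nc` is the OpenBSD netcat (whose `-X connect` and `-x` options tunnel through an HTTP CONNECT proxy), and the domain and proxy address shown are hypothetical.

```python
def ssh_config_stanza(domain, proxy_host, proxy_port, ssh_port=2222):
    """Render an OpenSSH client config stanza that tunnels through an HTTP proxy."""
    # Assumption: the CF SSH endpoint is ssh.<domain>, listening on ssh_port.
    # Assumption: OpenBSD netcat is installed; -X connect selects HTTP CONNECT
    # and -x names the proxy. %h and %p are expanded by ssh to host and port.
    return "\n".join([
        f"Host ssh.{domain}",
        f"    Port {ssh_port}",
        f"    ProxyCommand nc -X connect -x {proxy_host}:{proxy_port} %h %p",
    ])


# Hypothetical values; the output would be appended to ~/.ssh/config.
print(ssh_config_stanza("example.com", "proxy.corp.example", 3128))
```

This only covers the case where the native ssh client is used; whether the CLI plugin itself can be pointed at a proxy is a separate question.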

2- default "recycle tainted containers by default" seems good to me

Given that apps deployed on CF comply with the 12-factor app principles, their instances may be restarted at any time (e.g. during the deployment of a new CF release or a stemcell upgrade). So the default "recycle tainted containers by default" policy is not a surprise.

3- need to be completed with more control of the recycling policy (UX such as "quarantine" or GAE "lock/unlock" )

There are some specific use-cases where the "recycle tainted containers by default" policy would be problematic when running applications in production:

An application instance is malfunctioning (e.g. hanging) and interactive debugging is necessary. The app-ops ssh into the container and start taking some diagnostic steps (e.g. sending kill -SIGQUIT signals to take thread dumps, or locally changing log levels).

If the ssh connection ever breaks or times out, the "recycle tainted containers by default" policy would restart the instance, preventing the current diagnostic from completing.

Another similar use case: a production application is suspected of having been compromised by an attacker. App-ops need to capture evidence and better understand how the abuse was carried out. There isn't enough information in the streamed logs, and there is a need to get into the container to inspect the ephemeral FS, the processes and the memory. This may require more than one simultaneous SSH connection, and may span multiple hours.

In both use cases above, while the application is 12-factor compliant and the "recycle tainted containers by default" policy would be opted into on the corresponding space, there would be a need to transiently turn the mode off.

In terms of user experience, this may appear as an explicit user request to "quarantine" the tainted app instances (or the whole app) so that CF does not attempt to restart them. Or it may appear like the Google App Engine "lock/unlock" model:

A call to a new "unlock" command on a CF app instance would be necessary to get SSH access to it. CF then considers this instance as "tainted"/untrusted, as it may have deviated from the pushed content, and no longer acts on it (i.e. does not monitor its bound $PORT or root process exit, which may be handy to diagnose it as desired). When the "lock" command is requested on this instance, CF destroys this tainted instance and recreates a fresh "trusted" one.
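
The lock/unlock lifecycle described above can be sketched as a small state model in Python. This is purely illustrative: the class, method and attribute names are hypothetical and do not correspond to any existing CF or Diego API.

```python
from enum import Enum


class State(Enum):
    TRUSTED = "trusted"
    TAINTED = "tainted"


class Instance:
    """Hypothetical model of one app instance under a lock/unlock policy."""

    def __init__(self, index):
        self.index = index
        self.state = State.TRUSTED
        self.monitored = True   # platform watches $PORT and root process exit
        self.generation = 1     # incremented each time a fresh container is created

    def unlock(self):
        # Unlocking marks the instance as tainted and stops health monitoring,
        # so a hung or modified process is left alone for diagnosis.
        self.state = State.TAINTED
        self.monitored = False

    def ssh(self):
        # SSH access is only granted once the instance has been unlocked.
        if self.state is not State.TAINTED:
            raise PermissionError("instance must be unlocked before SSH access")
        return f"ssh session on instance {self.index}"

    def lock(self):
        # Locking destroys the tainted container and recreates a trusted one.
        self.generation += 1
        self.state = State.TRUSTED
        self.monitored = True


inst = Instance(0)
inst.unlock()
print(inst.ssh())
inst.lock()
print(inst.state.value, inst.generation)
```

In this model a broken SSH connection has no effect on the instance: only the explicit lock request triggers the recycle, which addresses both production use cases above.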

4- development use-cases need to be better supported (dev/prod parity) not sure ssh/scp is the right path though

I agree with James Myers that development use-cases should be better supported.

First, CF should strive to support dev-prod parity [4]. However, there is currently no longer a version of CF that a developer can run on their laptop (e.g. when doing offline development during a commute) that would behave like prod and embed buildpacks. There used to be "CF on a single VM". Heroku and GAE have emulators. Cloud rocker [5] is close, but it still takes 10s or more for changes made to the app to be reflected in a running app.

There are some legitimate use cases during development for modifying the sources of the application and having those changes take effect immediately. Lots of app development frameworks support those development modes (even those that promote test-driven practices), and getting fast feedback is important. Having dev-prod parity means supporting these use cases while preserving prod behavior (having VCAP_SERVICES and VCAP_APPLICATION set, and the buildpack processing applied on the same stack (cflinuxfs2)). Being able to run offline would be even better.

However, I believe that providing SSH/SCP access to change the file system of a running app instance may not be the appropriate response, given that the FS and the app instance are still ephemeral. Who would want to modify files that could be lost at any time (e.g. during a stemcell upgrade)?

I'd rather see value in further exploring the ideas laid out by James Bayer in [5], e.g. in the form of a git repo populated with the /home/vcap/app subdir, that developers could clone and push to, and have the instance's ephemeral FS updated with the pushed changes.

This may be combined with a cloudrocker mechanism so as to work in a fully offline mode when this is required.
[5] https://docs.google.com/document/d/1_C3OWS6giWx4JL_IL9YLA6jcppyQLVD-YjR0GeA8Z0s/edit#heading=h.toypuu5pxh65



On Thu, Jul 2, 2015 at 10:18 PM, James Myers <[hidden email]> wrote:
I have to agree with Matt on this one. I feel that the recycling of containers is a very anti-developer default. When you approach Cloud Foundry from the perspective of running production applications the recycle policy makes complete sense. However, I feel that this misses out on one of the massive benefits/use cases of Cloud Foundry, what it offers to the development process.

From a security standpoint, if you can ssh into a container, it means you have write access to the application in CloudFoundry. Thus you can already push new bits/change the application in question. All of the "papertrail" functionality around pushing/changing applications exists for SSH as well (we record events, output log lines, make it visible to users that action was taken on the application), and thus concerned operators would be able to determine whether someone is modifying the application in question.

Therefore I'm lost on how this is truly the most secure default. If we are really going by the idea that all defaults should be the most secure, ssh should be disabled by default.

As a developer, I can see many times in which I would want to be able to ssh into my container and change my application as part of a troubleshooting process. Using BOSH as an example, CF Devs constantly ssh into VMs and change the processes running on them in order to facilitate development. BOSH does not reap the VM and redeploy a new instance when you have closed the SSH session. Once again this is largely due to the fact that if you have SSH access, you can already perform the necessary actions to change the application through different means.

Another huge hindrance to development is that the recycling policy is controlled by administrators. It is not something that normal users can control, even though we give the end user the granularity of enabling/disabling SSH completely. This seems counterintuitive.

I feel that a better solution would be to provide the user with some knowledge of which instances may be tainted, and then allowing them to opt into a policy which will reap tainted containers. This provides users with clear insight that their application instance may be a snowflake (and that they may want to take action), while also allowing normal behavior with regards to SSH access to containers.
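
A minimal Python sketch of this alternative, under stated assumptions: instances are only flagged as tainted, and nothing is reaped unless the user has opted in. The function names and the dictionary shape are hypothetical, not part of any CF API.

```python
def tainted_instances(instances):
    """List the instance indices that have been accessed over SSH."""
    # instances maps an instance index to a "tainted" flag (hypothetical shape).
    return sorted(index for index, tainted in instances.items() if tainted)


def instances_to_reap(instances, reap_tainted=False):
    """Return the indices to recycle; nothing is reaped unless the user opted in."""
    if not reap_tainted:
        return []
    return tainted_instances(instances)


app = {0: False, 1: True, 2: True}
print(tainted_instances(app))                     # visibility only
print(instances_to_reap(app))                     # default: no recycling
print(instances_to_reap(app, reap_tainted=True))  # explicit opt-in
```

The design choice here is that visibility is unconditional while recycling is a user decision, which is the reverse of the default discussed earlier in the thread.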

To summarize, by enabling the recycling policy by default we not only produce extremely unusual behavior / workflows for developers, we are also minimizing the developer-friendliness of CF in general. This mixed with the fact that as a user I cannot even control this policy, leads me to believe that as a default recycling should be turned off as it provides the most cohesive and friendly user experience.

On Mon, Jun 29, 2015 at 9:14 AM, John Wong <[hidden email]> wrote:
> after executing a command, concluding an interactive session, or copying a file into an instance, that instance will be restarted.

How does it monitor the behavior? Is there a whitelist of commands? I am curious because I am trying to find out what the whitelist contains. Also, is it at the end of the bosh ssh APP_NAME session? What if two users are there simultaneously?

Thanks.

On Mon, Jun 29, 2015 at 5:49 AM, Dieu Cao <[hidden email]> wrote:
I think the CLI could show a clarifying message, when using ssh, about what the current recycling policy is.
Eric, what do you think about calling it the "recycling" policy, enabled by default? =D

-Dieu



Re: [cf-dev] SSH access to CF app instances on Diego

Gwenn Etourneau
Guillaume,

Not sure if this is related, but ssh has already been implemented and is available in diego and cf-release.


On Thu, Jul 30, 2015 at 10:35 PM, Guillaume Berche <[hidden email]> wrote:
Eric,

The CAB minutes [1] mentioned you were still looking for feedback from the community on the policy for altered instances, but this thread seems to have been silent for a while.

Not sure whether you had seen my email and suggestion below for a way to quarantine the altered instances (beyond the per-space restart policy configuration). Such a quarantine request might be a good place to include an option to ask for the quarantined instances to be excluded from gorouter traffic.

Regards,

Guillaume.

On Fri, Jul 3, 2015 at 3:56 PM, Guillaume Berche <[hidden email]> wrote:
Hi,

please find my feedback to this thread

short version:
1- need preserve good CF experience with HTTP only (direct SSH flow is still blocked and a pain in many organisations) => +1 to preserve "cf files" or fine tune diego plug to have ssh over HTTP to work out of the box
2- default "recycle tainted containers by default" policy seems good to me
3-  needs to be completed with more control of the recycling policy (UX such as "quarantine" or GAE "lock/unlock" )
4- development use-cases need to be better supported (dev/prod parity) not sure ssh/scp is the right path though

long  version:

1- cf files and ssh over HTTP

As previously mentionned into [1], CF exposing apis over HTTP api made a great job to be easily consummed through HTTP proxies that some companies still use, making CF experience seemless to consumme public paas, or private paas among corporate entities. It seems important to me to preserve good CF experience with HTTP only.

If SSH interactive access, scp and port forwarding become the mainstream solution to operate and troubleshoot apps (supporting "cf files", replacement for the previous DEBUG and CONSOLE ports), it will be useful for users behind such firewalls to be able to configure diego ssh plugin to use HTTP/SOCKS proxies to reach public CF instances. As the diego ssh cli plugin supports using the regular local host ssh binaries, this may potentially be done by tweaking the .ssh config file to add flags associated to host ssh.${domain} to go through proxies (possibly double tunnels as described into  [2]).  However, for new users in such network context, especially on windows operating system, the set up work before using a CF public instance starts to add up?

2- default "recycle tainted containers by default" seems good to me

Given that apps deployed on CF comply to 12 factor apps, there instance may be restarted at anytime (e.g. during a CF new release deployment or stemcell upgrade). So the default policy "recycle tainted containers by default" is not a surprise.

3- need to be completed with more control of the recycling policy (UX such as "quarantine" or GAE "lock/unlock" )

There are some specific use-cases where the "recycle tainted containers by default" policy would be problematic when running applications in production:

An application instance is malfunctionning (e.g. hanging) and an interactive debugging is necessary. The app-ops ssh into the container and starts taking some diagnostic steps (e.g sending kill -SIGTERM signals to take thread dumps, or locally changes log levels).

If ever the ssh connection breaks/timeout, the "recycle tainted containers by default, preventing the current diagnostc to complete.

Another similar use case: a production application is suspected to be compromised by an attacker. App-ops need to capture evidences and understand better how the abuse was done. There isn't enough information in streamed logs, and there is a need to get into the container to inspect the ephemeral FS  and the processes and memory. This may require more than one simultanenous SSH connection, and may span on multiple hours

In both use-cases above, while the application is 12 factor compliant and the "recycle tainted containers by default" policy would be opted in on the corresponding space, there would be a need to transiently turn the mode off.

In term of user experience, this may appear as an explicit user request to "quarantine" the tainted app instances (or the whoe app) so that CF does not attempt to restart them. Or it may appear as the google app engine "lock/unlock"

a call to a new "unlock" command to a CF app instance would be necessary to get SSH access to it. CF then considers this instance as "tained"/untrusted, as it may have deviated from the pushed content, and does not act to it anymore (i.e. does not monitor its bound $PORT or root process exit, which may be handy to diagnose it as wish). When the "lock" command is requested on this instance, Cf destroys this tainted instance, and recreates a fresh new "trusted" one.

4- development use-cases need to be better supported (dev/prod parity) not sure ssh/scp is the right path though

I agree with James Myers that development use-cases should be better supported.

First, CF should strive to support dev-prod parity [4]. However currently, there is not anymore a version of CF that a developper can run on his laptop (e.g. when doing offline development during commute) that would behave like prod and embed buildpacks. There used to have "CF on a single VM". Heroku or GAE have emulators. Cloud rocker [5] is close, but it still takes 10s or more to have changes made on the app be reflected into a running app.

There are some legitimate use cases during development for modifying sources of the application and have those changes be taken in effect immediately. Lots of app development framework supports those development modes (even those that promote test-driven practices), and getting a fast feedback is important. Having dev-prod parity means supporting these use cases while preserving prod behavior (having the VCAP_SERVICES and VCAP_APPLICATION and the buildpack processing applied on the same stack (cflinux2)). Being able to run offline would be even better.

I however believe that providing SSH/SCP access to change the file system to a running app instance may not be the appropriate response, given the FS and the app instance is still ephemeral. Who would want to modify files that could be lost at any time (e.g. a stemcell upgrade ) ?

I'd rather see value in further exploring the ideas layed out by James Bayer into [5] e.g. as a form of a git repo populated with the /home/vcap/app subdir, that developers could clone, push to, and have the instance epheremal FS updated with pushed changes.

This may be combined with a cloudrocker mechanism as to work with a fully offline mode when this is required.
[5] https://docs.google.com/document/d/1_C3OWS6giWx4JL_IL9YLA6jcppyQLVD-YjR0GeA8Z0s/edit#heading=h.toypuu5pxh65



On Thu, Jul 2, 2015 at 10:18 PM, James Myers <[hidden email]> wrote:
I have to agree with Matt on this one. I feel that the recycling of containers is a very anti-developer default. When you approach Cloud Foundry from the perspective of running production applications the recycle policy makes complete sense. However, I feel that this misses out on one of the massive benefits/use cases of Cloud Foundry, what it offers to the development process.

From a security stand point, if you can ssh into a container, it means you have write access to the application in CloudFoundry. Thus you can already push new bits/change the application in question. All of the "papertrail" functionality around pushing/changing applications exists for SSH as well (we record events, output log lines, make it visible to users that action was taken on the application), and thus concerned operators would be able to determine if someone modifying the application in question.

Therefore I'm lost on how this is truly the most secure default. If we are really going by the idea that all defaults should be the most secure, ssh should be disabled by default.

As a developer, I can see many times in which I would want to be able to ssh into my container and change my application as part of a troubleshooting process. Using BOSH as an example, CF Devs constantly ssh into VMs and change the processes running on them in order to facilitate development. BOSH does not reap the VM and redeploy a new instance when you have closed the SSH session. Once again this is largely due to the fact that if you have SSH access, you can already perform the necessary actions to change the application through different means.

Another huge hindrance to development, is that the recycling policy is controlled by administrators. It is not something that normal users can control, even though we allow the granularity of enabling/disabling SSH completely to the end user. This seems counterintuitive.

I feel that a better solution would be to provide the user with some knowledge of which instances may be tainted, and then allowing them to opt into a policy which will reap tainted containers. This provides users with clear insight that their application instance may be a snowflake (and that they may want to take action), while also allowing normal behavior with regards to SSH access to containers.

To summarize, by enabling the recycling policy by default we not only produce extremely unusual behavior / workflows for developers, we are also minimizing the developer-friendliness of CF in general. This mixed with the fact that as a user I cannot even control this policy, leads me to believe that as a default recycling should be turned off as it provides the most cohesive and friendly user experience.

On Mon, Jun 29, 2015 at 9:14 AM, John Wong <[hidden email]> wrote:
> after executing a command, concluding an interactive session, or copying a file into an instance, that instance will be restarted.

How does it monitor the behavior? Is there a list of commands whitelisted? I am curious because I am trying to find out what the whitelist contain. Also is it at the end of the bosh ssh APP_NAME session? What if two users are there simultaneously?

Thanks.

On Mon, Jun 29, 2015 at 5:49 AM, Dieu Cao <[hidden email]> wrote:
I think with the CLI we could add clarifying messaging when using ssh what the current policy around recycling is.
Eric, what do you think about calling it the "recycling" policy, enabled by default? =D

-Dieu


On Sat, Jun 27, 2015 at 3:42 AM, Matthew Sykes <[hidden email]> wrote:
Depends on your role and where your app is in the deployment pipeline. Most of the scenarios I envisioned were for the tail end of development where you need to poke around to debug and figure out those last few problems.

For example, Ryan Morgan was saying that the Cloud Foundry plugin for eclipse is going to be using the ssh support in diego to enable debug of application instances in the context of a buildpack deployed app. This is aligned with other requirements I've heard from people working on dev tools.

As apps reach production, I would hope that interactive ssh is disabled entirely on the prod space leaving only scp in source mode as an option (something the proxy can do).

Between dev and prod, there's a spectrum, but in general, I either expect access to be enabled or disabled - not enabled with a suicidal tendency.

On Thu, Jun 25, 2015 at 10:53 PM, Benjamin Black <[hidden email]> wrote:
matt,

could you elaborate a bit on what you believe ssh access to instances is for? 


b


On Thu, Jun 25, 2015 at 9:29 PM, Matthew Sykes <[hidden email]> wrote:
My concern is the default behavior.

When I first prototyped this support in February, I never expected that merely accessing a container would cause it to be terminated. As we can see from Jan's response, it's completely unexpected; many others have the same reaction.

I do not believe that this behavior should be part of the default configuration and I do believe the control needs to be at the space level. I have have already expressed this opinion during Diego retros and at the runtime PMC meeting.

I honestly believe that if we were talking about applying this behavior to `bosh ssh` and `bosh scp`, few would even consider running in a 'kill on taint mode' because of how useful it is. We should learn from that.

If this behavior becomes the default, I think our platform will be seen as moving from opinionated to parochial. That would be unfortunate.


On Thu, Jun 25, 2015 at 6:05 PM, James Bayer <[hidden email]> wrote:
you can turn the "restart tainted containers" feature off with configuration if you are authorized to do so. then using scp to write files into a container would be persisted for the lifetime of the container even after the ssh session ends.

On Thu, Jun 25, 2015 at 5:50 PM, Jan Dubois <[hidden email]> wrote:
On Thu, Jun 25, 2015 at 5:36 PM, Eric Malm <[hidden email]> wrote:
> after executing a command, concluding an
> interactive session, or copying a file into an instance, that instance will
> be restarted.

What is the purpose of being able to copy a file into an instance if
the instance is restarted as soon as the file has been received?

Cheers,
-Jan
_______________________________________________
cf-dev mailing list
[hidden email]
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev



--
Thank you,

James Bayer

_______________________________________________
cf-dev mailing list
[hidden email]
https://lists.cloudfoundry.org/mailman/listinfo/cf-dev




--
Matthew Sykes
[hidden email]


Re: [cf-dev] SSH access to CF app instances on Diego

James Bayer
gwenn, i suspect guillaume is referencing the platform policies around ssh and container lifecycle which has not had more discussion that i'm aware of. i think eric has received some additional feedback from some people, but i have not seen what it is either or an outline of how the defaults and configuration options will be exposed.

On Thu, Jul 30, 2015 at 6:01 PM, Gwenn Etourneau <[hidden email]> wrote:
Guillaume,

Not sure if this is related, but SSH has already been implemented and is available in Diego and cf-release.


On Thu, Jul 30, 2015 at 10:35 PM, Guillaume Berche <[hidden email]> wrote:
Eric,

The CAB minutes [1] mentioned that you were still looking for feedback from the community on the policy for altered instances, but this thread seems to have been silent for a while.

I'm not sure whether you saw my email and suggestion below for a way to quarantine the altered instances (beyond the per-space restart policy configuration). Such a quarantine request might be a good place to include an option to ask for the quarantined instances to be excluded from gorouter traffic.
Regards,

Guillaume.

On Fri, Jul 3, 2015 at 3:56 PM, Guillaume Berche <[hidden email]> wrote:
Hi,

please find my feedback to this thread

short version:
1- need to preserve a good CF experience over HTTP only (direct SSH flows are still blocked, and a pain, in many organisations) => +1 to preserving "cf files" or fine-tuning the Diego plugin so that SSH over HTTP works out of the box
2- the default "recycle tainted containers" policy seems good to me
3- it needs to be complemented with more control over the recycling policy (UX such as "quarantine" or GAE "lock/unlock")
4- development use cases need to be better supported (dev/prod parity); not sure ssh/scp is the right path, though

long  version:

1- cf files and ssh over HTTP

As previously mentioned in [1], by exposing its APIs over HTTP, CF has done a great job of being easy to consume through the HTTP proxies that some companies still use, which makes for a seamless experience when consuming a public PaaS, or a private PaaS shared among corporate entities. It seems important to me to preserve a good CF experience over HTTP only.

If interactive SSH access, scp and port forwarding become the mainstream solution for operating and troubleshooting apps (covering what "cf files" does and replacing the previous DEBUG and CONSOLE ports), it will be useful for users behind such firewalls to be able to configure the Diego SSH plugin to use HTTP/SOCKS proxies to reach public CF instances. As the Diego SSH CLI plugin supports using the regular local ssh binaries, this could potentially be done by tweaking the .ssh config file to add flags for the host ssh.${domain} so that it goes through proxies (possibly double tunnels, as described in [2]). However, for new users in such a network context, especially on Windows, the setup work required before using a public CF instance starts to add up.

2- default "recycle tainted containers by default" seems good to me

Given that apps deployed on CF comply with the 12-factor app principles, their instances may be restarted at any time (e.g. during the deployment of a new CF release or a stemcell upgrade). So the default "recycle tainted containers" policy is not a surprise.

3- need to be completed with more control of the recycling policy (UX such as "quarantine" or GAE "lock/unlock" )

There are some specific use-cases where the "recycle tainted containers by default" policy would be problematic when running applications in production:

An application instance is malfunctioning (e.g. hanging) and interactive debugging is necessary. The app-ops SSH into the container and start taking some diagnostic steps (e.g. sending kill -SIGQUIT signals to take thread dumps, or locally changing log levels).

If the SSH connection ever breaks or times out, the "recycle tainted containers by default" policy would restart the instance, preventing the current diagnostic from completing.

Another similar use case: a production application is suspected of having been compromised by an attacker. App-ops need to capture evidence and better understand how the abuse was carried out. There isn't enough information in the streamed logs, and there is a need to get into the container to inspect the ephemeral FS, the processes and the memory. This may require more than one simultaneous SSH connection, and may span multiple hours.

In both use-cases above, while the application is 12 factor compliant and the "recycle tainted containers by default" policy would be opted in on the corresponding space, there would be a need to transiently turn the mode off.

In terms of user experience, this may appear as an explicit user request to "quarantine" the tainted app instances (or the whole app) so that CF does not attempt to restart them. Or it may appear as something like the Google App Engine "lock/unlock" model:

A call to a new "unlock" command on a CF app instance would be necessary to get SSH access to it. CF then considers this instance "tainted"/untrusted, as it may have deviated from the pushed content, and no longer acts on it (i.e. does not monitor its bound $PORT or root process exit, which may be handy for diagnosing it as desired). When the "lock" command is requested on this instance, CF destroys the tainted instance and recreates a fresh "trusted" one.
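To make the lock/unlock flow described above a bit more concrete, here is a rough sketch of the state transitions it implies. Everything in it (the AppInstance class, the State names, the generation counter) is hypothetical and only illustrates the proposal; none of it corresponds to an existing CF or Diego API.

```python
from enum import Enum


class State(Enum):
    TRUSTED = "trusted"  # matches the pushed content; the platform manages it
    TAINTED = "tainted"  # unlocked for SSH; the platform leaves it alone


class AppInstance:
    """Hypothetical model of one app instance (not a real CF API)."""

    def __init__(self, index):
        self.index = index
        self.state = State.TRUSTED
        self.generation = 1  # bumped each time the instance is recreated

    def unlock(self):
        # Unlocking marks the instance as tainted and enables SSH access.
        self.state = State.TAINTED

    def ssh_allowed(self):
        return self.state is State.TAINTED

    def monitored(self):
        # Port and root-process checks apply only to trusted instances.
        return self.state is State.TRUSTED

    def lock(self):
        # Locking destroys a tainted instance and recreates a trusted one.
        if self.state is State.TAINTED:
            self.generation += 1
            self.state = State.TRUSTED


instance = AppInstance(index=0)
instance.unlock()
print(instance.ssh_allowed(), instance.monitored())  # True False
instance.lock()
print(instance.state.name, instance.generation)  # TRUSTED 2
```

The point of the sketch is that a dropped SSH connection changes nothing: only an explicit "lock" request triggers the recreation of the instance.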

4- development use-cases need to be better supported (dev/prod parity) not sure ssh/scp is the right path though

I agree with James Myers that development use-cases should be better supported.

First, CF should strive to support dev-prod parity [4]. However, there is currently no longer a version of CF that a developer can run on their laptop (e.g. for offline development during a commute) that behaves like prod and embeds buildpacks. There used to be "CF on a single VM". Heroku and GAE have emulators. Cloud rocker [5] is close, but it still takes 10s or more for changes made to the app to be reflected in the running app.

There are some legitimate use cases during development for modifying the application's sources and having those changes take effect immediately. Lots of app development frameworks support such development modes (even those that promote test-driven practices), and getting fast feedback is important. Having dev-prod parity means supporting these use cases while preserving prod behavior (having VCAP_SERVICES and VCAP_APPLICATION, and the buildpack processing, applied on the same stack (cflinuxfs2)). Being able to run offline would be even better.

I believe, however, that providing SSH/SCP access to change the file system of a running app instance may not be the appropriate response, given that the FS and the app instance are still ephemeral. Who would want to modify files that could be lost at any time (e.g. during a stemcell upgrade)?

I'd rather see value in further exploring the ideas laid out by James Bayer in [5], e.g. in the form of a git repo populated with the /home/vcap/app subdir that developers could clone and push to, with the instance's ephemeral FS updated with the pushed changes.

This may be combined with a cloudrocker-like mechanism to work in a fully offline mode when required.
[5] https://docs.google.com/document/d/1_C3OWS6giWx4JL_IL9YLA6jcppyQLVD-YjR0GeA8Z0s/edit#heading=h.toypuu5pxh65



On Thu, Jul 2, 2015 at 10:18 PM, James Myers <[hidden email]> wrote:
I have to agree with Matt on this one. I feel that the recycling of containers is a very anti-developer default. When you approach Cloud Foundry from the perspective of running production applications the recycle policy makes complete sense. However, I feel that this misses out on one of the massive benefits/use cases of Cloud Foundry, what it offers to the development process.

From a security standpoint, if you can ssh into a container, it means you have write access to the application in Cloud Foundry. Thus you can already push new bits/change the application in question. All of the "papertrail" functionality around pushing/changing applications exists for SSH as well (we record events, output log lines, make it visible to users that action was taken on the application), and thus concerned operators would be able to determine whether someone is modifying the application in question.

Therefore I'm lost on how this is truly the most secure default. If we are really going by the idea that all defaults should be the most secure, ssh should be disabled by default.

As a developer, I can see many times in which I would want to be able to ssh into my container and change my application as part of a troubleshooting process. Using BOSH as an example, CF Devs constantly ssh into VMs and change the processes running on them in order to facilitate development. BOSH does not reap the VM and redeploy a new instance when you have closed the SSH session. Once again this is largely due to the fact that if you have SSH access, you can already perform the necessary actions to change the application through different means.

Another huge hindrance to development is that the recycling policy is controlled by administrators. It is not something that normal users can control, even though we allow the granularity of enabling/disabling SSH completely to the end user. This seems counterintuitive.

I feel that a better solution would be to provide the user with some knowledge of which instances may be tainted, and then allow them to opt into a policy which will reap tainted containers. This provides users with clear insight that their application instance may be a snowflake (and that they may want to take action), while also allowing normal behavior with regards to SSH access to containers.
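As a rough illustration of that alternative, the sketch below records taint on every SSH session and only restarts the instance when the user has opted in. The names (Instance, on_ssh_session_end, reap_tainted, tainted_indices) are made up for this example and are not part of CF or Diego.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    index: int
    tainted: bool = False  # set once anyone has had SSH access to it


def on_ssh_session_end(instance, reap_tainted=False):
    """Return the action to take when an SSH session closes.

    reap_tainted is a hypothetical user-controlled setting (per space
    or per app) and is off by default.
    """
    instance.tainted = True
    return "restart" if reap_tainted else "keep"


def tainted_indices(instances):
    # What a CLI could display so users know which instances may be snowflakes.
    return [i.index for i in instances if i.tainted]


instances = [Instance(0), Instance(1)]
print(on_ssh_session_end(instances[1]))  # keep
print(tainted_indices(instances))  # [1]
```

With the setting off, the only visible effect of an SSH session is that the instance shows up as tainted; users who want the stricter behavior pass reap_tainted=True.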

To summarize, by enabling the recycling policy by default we not only produce extremely unusual behavior / workflows for developers, we are also minimizing the developer-friendliness of CF in general. This mixed with the fact that as a user I cannot even control this policy, leads me to believe that as a default recycling should be turned off as it provides the most cohesive and friendly user experience.

On Mon, Jun 29, 2015 at 9:14 AM, John Wong <[hidden email]> wrote:
> after executing a command, concluding an interactive session, or copying a file into an instance, that instance will be restarted.

How does it monitor the behavior? Is there a list of whitelisted commands? I am curious because I am trying to find out what the whitelist contains. Also, does the restart happen at the end of the cf ssh APP_NAME session? What if two users are there simultaneously?

Thanks.

On Mon, Jun 29, 2015 at 5:49 AM, Dieu Cao <[hidden email]> wrote:
I think that in the CLI we could add messaging, shown when using ssh, that clarifies what the current recycling policy is.
Eric, what do you think about calling it the "recycling" policy, enabled by default? =D

-Dieu


On Sat, Jun 27, 2015 at 3:42 AM, Matthew Sykes <[hidden email]> wrote:
Depends on your role and where your app is in the deployment pipeline. Most of the scenarios I envisioned were for the tail end of development where you need to poke around to debug and figure out those last few problems.

For example, Ryan Morgan was saying that the Cloud Foundry plugin for Eclipse is going to use the SSH support in Diego to enable debugging of application instances in the context of a buildpack-deployed app. This is aligned with other requirements I've heard from people working on dev tools.

As apps reach production, I would hope that interactive ssh is disabled entirely on the prod space leaving only scp in source mode as an option (something the proxy can do).

Between dev and prod, there's a spectrum, but in general, I either expect access to be enabled or disabled - not enabled with a suicidal tendency.

On Thu, Jun 25, 2015 at 10:53 PM, Benjamin Black <[hidden email]> wrote:
matt,

could you elaborate a bit on what you believe ssh access to instances is for? 


b


On Thu, Jun 25, 2015 at 9:29 PM, Matthew Sykes <[hidden email]> wrote:
My concern is the default behavior.

When I first prototyped this support in February, I never expected that merely accessing a container would cause it to be terminated. As we can see from Jan's response, it's completely unexpected; many others have the same reaction.

I do not believe that this behavior should be part of the default configuration and I do believe the control needs to be at the space level. I have already expressed this opinion during Diego retros and at the runtime PMC meeting.

I honestly believe that if we were talking about applying this behavior to `bosh ssh` and `bosh scp`, few would even consider running in a 'kill on taint mode' because of how useful it is. We should learn from that.

If this behavior becomes the default, I think our platform will be seen as moving from opinionated to parochial. That would be unfortunate.


On Thu, Jun 25, 2015 at 6:05 PM, James Bayer <[hidden email]> wrote:
you can turn the "restart tainted containers" feature off with configuration if you are authorized to do so. then using scp to write files into a container would be persisted for the lifetime of the container even after the ssh session ends.

On Thu, Jun 25, 2015 at 5:50 PM, Jan Dubois <[hidden email]> wrote:
On Thu, Jun 25, 2015 at 5:36 PM, Eric Malm <[hidden email]> wrote:
> after executing a command, concluding an
> interactive session, or copying a file into an instance, that instance will
> be restarted.

What is the purpose of being able to copy a file into an instance if
the instance is restarted as soon as the file has been received?

Cheers,
-Jan

--
Thank you,

James Bayer


--
Matthew Sykes
[hidden email]


--
Thank you,

James Bayer
