We understand that new tools often require review by IT, security, and data governance teams before adoption. Below are the most common questions we receive, answered directly and specifically.
What does MyDataWork actually do?
MyDataWork is a workspace for data analysts, BI developers, data engineers, and data scientists to catalog their files, document use cases, track outcomes, and visualize relationships between data assets. It is an organizational and productivity tool — not a data processing or transformation platform. It does not query, transform, or move data within your systems.
What data does MyDataWork access or store?
MyDataWork stores only metadata — never the contents of any data file. The metadata captured falls into three layers.
Filesystem metadata (what the operating system already exposes): file names, file paths, file types and tool classifications (Excel, SQL, Python, Jupyter notebooks, Alteryx, Tableau, Power BI, CSV, ThoughtSpot, Dataiku project exports, Looker LookML), file sizes, file creation dates, and last-modified dates.
Shallow structural metadata (extracted from a file’s structure, never its values): for SQL scripts, the table names referenced in FROM/JOIN clauses; for Excel files, sheet names; for CSV files, column names; for Jupyter notebooks, embedded SQL; for Alteryx workflows, connection hints; for ThoughtSpot .tml, object kind and referenced tables; for Dataiku project exports, project name and recipe/dataset counts; for Looker .lkml, dimensions, measures, and sql_table_name references. Data values, formula contents, and cell contents are never read. The structural metadata is what powers the lineage map and the “Not in catalog” external-dependency badges.
User-entered content: tags, notes, use case descriptions, objectives, progress measurements, stakeholder information, optional notes URLs (HTTPS links to external documentation), and analyst-defined relationships.
The local Connector indexes the filesystem and shallow structural metadata listed above on your machine; no file content ever leaves your computer. When AI features are used, MyDataWork sends a subset of this metadata, never file contents, to OpenAI’s API for analysis. The exact fields are listed under “What data is sent to AI services?” below.
Where is data stored?
All workspace data is stored in a managed PostgreSQL database (AWS RDS) hosted in the United States (us-east-1 / Northern Virginia region). The database runs in a Multi-AZ configuration with a synchronous standby in a second Availability Zone, is encrypted at rest using AWS KMS, has automated daily backups with 7-day retention and point-in-time recovery, and is not publicly accessible from the internet — connections are accepted only from within the application VPC.
Does MyDataWork connect to our databases, data warehouse, or BI tools?
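MyDataWork does not query or read data from your databases, data warehouses, or BI tools, and it does not access the underlying data they hold. The only platform connections it can make are the optional, metadata-only cloud source connections described in this section.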
MyDataWork can optionally connect to cloud platforms — GitHub, dbt Cloud, Databricks, and Snowflake — using read-only API tokens provided by the user. These connections index only asset metadata exposed by each platform’s API: file names, paths, model names, table names, and schema information. No data records, query results, or file contents are accessed or stored.
The local Connector recognizes a defined set of file types by extension — including Excel (.xlsx, .xlsm), CSV, SQL (.sql), Python (.py, .ipynb), Tableau (.twb, .twbx), Power BI (.pbix, .pbit), Alteryx (.yxmd), ThoughtSpot (.tml), Dataiku project exports (.zip), and Looker LookML (.lkml). For each, it captures filesystem metadata (name, path, size, creation and modification dates) and performs shallow structural inspection to extract schema-level references — for example, table names referenced in SQL FROM/JOIN clauses, sheet names from .xlsx, column names from .csv, dimensions/measures from .lkml. Data values, formula contents, and cell contents are never read or transmitted. This structural metadata is what powers the lineage map and the external-dependency flags.
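To make "shallow structural inspection" concrete, here is a minimal sketch of how table references and column names could be pulled out without touching data values. It is an illustration only, not the Connector's actual code: the function names and the regular expression are hypothetical, and a production extractor would need a real SQL parser to handle CTEs, subqueries, quoted identifiers, and comments.

```python
import csv
import io
import re
from collections.abc import Iterable

# Hypothetical pattern: captures the identifier that follows FROM or JOIN.
# This is a deliberate simplification, not a full SQL grammar.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def sql_table_refs(sql_text: str) -> list[str]:
    """Return distinct table names found after FROM/JOIN, in first-seen order."""
    seen: dict[str, None] = {}
    for name in TABLE_REF.findall(sql_text):
        seen.setdefault(name, None)
    return list(seen)


def csv_column_names(lines: Iterable[str]) -> list[str]:
    """Parse only the header row of a CSV; data rows are never parsed."""
    reader = csv.reader(lines)
    return next(reader, [])


if __name__ == "__main__":
    query = "SELECT o.id FROM sales.orders o JOIN customers c ON o.cid = c.id"
    print(sql_table_refs(query))
    print(csv_column_names(io.StringIO("id,amount,region\n1,9.5,EU\n")))
```

In this sketch the output of each function contains only names (tables, columns), which matches the kind of schema-level reference the paragraph above describes.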
Cloud source connections are optional, require explicit configuration by a workspace admin, and use only the minimum API permissions needed for metadata indexing.
What data is sent to AI services?
MyDataWork uses OpenAI’s API to power AI features including use case recommendations, automation candidate identification, marketplace data suggestions, tool migration analysis, the workspace assistant, and the Workspace Agent. When these features are used, the following information is sent to OpenAI: use case titles, descriptions, objectives, and progress notes; asset names, file types, and detected data source references; stakeholder names; and workspace plan and configuration details.
The Workspace Agent specifically runs only when a user clicks “Analyze my workspace” — it is not autonomous and does not run on a schedule. When activated, it evaluates the workspace against a fixed set of business rules (for example, identifying assets linked to multiple use cases or use cases without assigned stakeholders) and sends the relevant metadata to OpenAI only for the purpose of phrasing the resulting suggestion in natural language. The detection itself is performed by deterministic logic within MyDataWork; OpenAI is not used to make the underlying judgment about whether a finding is valid.
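The split between deterministic detection and AI phrasing can be sketched as follows. This is an assumed shape, not MyDataWork's implementation: the `UseCase` class, the rule names, and the finding format are all hypothetical. It implements the two example rules named above (assets linked to multiple use cases, and use cases without assigned stakeholders) with plain logic and no model call.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    # Hypothetical minimal record; real use cases carry many more fields.
    title: str
    asset_names: list[str] = field(default_factory=list)
    stakeholders: list[str] = field(default_factory=list)


def find_issues(use_cases: list[UseCase]) -> list[dict]:
    """Apply fixed rules; every finding is produced by plain logic."""
    findings: list[dict] = []

    # Rule 1: use cases without an assigned stakeholder.
    for uc in use_cases:
        if not uc.stakeholders:
            findings.append({"rule": "no_stakeholder", "entity": uc.title})

    # Rule 2: assets linked to more than one use case.
    links: dict[str, set[str]] = {}
    for uc in use_cases:
        for asset in uc.asset_names:
            links.setdefault(asset, set()).add(uc.title)
    for asset, titles in sorted(links.items()):
        if len(titles) > 1:
            findings.append(
                {"rule": "shared_asset", "entity": asset, "use_cases": sorted(titles)}
            )
    return findings
```

Under this design, only the finished findings would be handed to a language model for wording, so the model cannot add or remove a finding.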
The MyDataWork Assistant has context about your workspace structure (assets, use cases, stakeholders, and progress) so it can answer questions specific to your work. It does not access file contents at any time.
File contents are never sent to OpenAI or any external service. AI features can be disabled in workspace settings if you do not wish to use them. AI usage is subject to a daily quota visible in the Setup tab, and the cost of every AI action is disclosed at the point of clicking. If an AI call fails for any reason, the credit is automatically refunded. OpenAI’s data processing terms apply to data sent through the API.
Is the Asset Estate Assessment handled differently from other AI features?
The Asset Estate Assessment is an AI-powered review of the workspace, runnable on demand. The first run is free on every plan; you can re-run it any time afterward (subsequent runs use AI credits like other AI features). It uses OpenAI in the same way other AI features do — file contents are never sent. What’s distinct from the other AI surfaces:
What is sent to OpenAI: asset metadata (names, paths, tool types, topic tags, lineage edges) and use case text (titles, objectives, descriptions, value figures, stakeholders) — never file contents. The assessment additionally sends structural observations derived from the lineage graph (cross-use-case asset concentration, external dependencies, orphan assets) and a snapshot of recent Workspace Agent findings.
Workspace-grounding filter: every finding the LLM produces is checked against the workspace’s actual assets, use cases, and stakeholders by a deterministic filter before it’s surfaced. Findings that name no specific workspace entity (template phrasings like “consolidate your KPI dashboards”) are dropped during synthesis. This is implemented in the application, not in the LLM call, so it can’t be bypassed by prompt-injection.
Eligibility tracking and the hashed-email log: the free-first-run affordance is enforced via a separate eligibility log keyed on a SHA-256 hash of the lowercased email. The plaintext email is not stored in the log — only the hash and a timestamp. This is the only place in MyDataWork where a per-email record persists across workspaces; its sole purpose is to prevent a user from re-claiming the free affordance by registering a new workspace under the same email. The hash is one-way; a security reviewer auditing this table sees timestamps and hex strings, not email addresses.
Storage of the result: each assessment run persists the rendered five-section markdown report to the workspace’s database, scoped to the workspace’s org_id. Users can revisit and re-export past assessments. The markdown is the artifact the workspace owner sees and exports; the LLM intermediate outputs are not retained.
Disabling: as with other AI features, the Assessment is gated by the AI-enabled setting in workspace settings. Disabling AI hides the Assessment card entirely.
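For reviewers who want to see what the eligibility log key looks like, the hashing step described above can be sketched in a few lines. The function name is hypothetical, and stripping surrounding whitespace is an assumption; the text only states that the email is lowercased and hashed with SHA-256.

```python
import hashlib


def eligibility_key(email: str) -> str:
    """Return the SHA-256 hex digest of the lowercased email."""
    # Lowercasing follows the description above; strip() is an assumption.
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The same address in different casing maps to the same 64-character key.
    print(eligibility_key("Ana@Example.com") == eligibility_key("ana@example.com"))
```

The digest is a 64-character hex string. A plain hash like this cannot be reversed directly, but someone who already knows a candidate email address can compute its hash and compare it against the log; whether the production log adds a salt or key is not stated here.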
Does the Jira integration store or transmit sensitive data?
The Jira integration stores your Jira instance URL and API token encrypted at rest in your MyDataWork workspace, using the same encryption applied to all other credentials in the system. These credentials are used only to push use case summaries to your Jira instance and are never transmitted elsewhere.
The content pushed to Jira contains only information you have explicitly entered into MyDataWork — use case titles, descriptions, objectives, progress notes, stakeholder names, and asset names. No file contents, no underlying data, and no system credentials other than the Jira token itself are involved in the integration.
The Jira integration is optional, disabled by default, and must be explicitly configured by a workspace admin in Setup → Integrations. Jira integration calls are rate-limited to 20 per hour to ensure reliable platform performance.
Does installing the Connector require admin privileges or network changes?
The Connector is a lightweight Windows desktop application. It typically does not require administrator privileges to install, does not open inbound network ports, does not modify system files, and does not require firewall rule changes. It communicates only with the MyDataWork web application over standard HTTPS (port 443).
What network access does MyDataWork require?
Users need outbound HTTPS access to app.mydatawork.com on port 443. No VPN, no special network configuration, and no inbound connections are required. The application is fully cloud-hosted. If the optional Jira integration is configured, outbound HTTPS connections are also made to your organization’s Jira instance URL on port 443.
MyDataWork applies rate limits to certain features to ensure reliable platform performance for all users: lineage rebuilds (5 per day), Jira integration calls (20 per hour), and exports (10 per day). Rate limits are documented in the in-product help and apply to normal usage.
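As an illustration of how limits like these behave, the sketch below implements a sliding-window limiter using the three quotas quoted above. Everything else is an assumption: the text does not say whether limits are counted per user or per workspace, nor whether "per day" means a rolling 24 hours or a calendar day. The class and feature names are hypothetical.

```python
import time
from collections import defaultdict, deque

# Quotas quoted in the text: (max calls, window length in seconds).
LIMITS = {
    "lineage_rebuild": (5, 86_400),
    "jira_call": (20, 3_600),
    "export": (10, 86_400),
}


class RateLimiter:
    """Sliding-window limiter keyed by (user, feature). Assumed design."""

    def __init__(self, limits=LIMITS, clock=time.monotonic):
        self.limits = limits
        self.clock = clock  # injectable so the window can be tested
        self.calls = defaultdict(deque)

    def allow(self, user_id: str, feature: str) -> bool:
        max_calls, window = self.limits[feature]
        now = self.clock()
        history = self.calls[(user_id, feature)]
        # Drop timestamps that have aged out of the window.
        while history and now - history[0] >= window:
            history.popleft()
        if len(history) >= max_calls:
            return False
        history.append(now)
        return True
```

With this sketch, a sixth lineage rebuild inside the same 24-hour window is refused, and the call is allowed again once the oldest recorded call falls outside the window.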
Can access be limited to specific users or teams?
Yes. MyDataWork supports workspace-level access control. Team plans designate a single workspace admin (Owner) who manages member access. The admin creates member accounts directly by entering each member’s name, email, and starter password. There is no email-invite flow into an existing workspace, and users cannot add themselves to a workspace; only an admin can add them. (Individuals can self-register at app.mydatawork.com to create a new workspace of their own on the Explorer plan; they cannot self-join an existing workspace.)
Team plan tiers enforce seat limits: Team Starter supports 2 to 5 members in total, and Team Growth supports 6 to 10 members. Admins can remove members at any time, transfer admin ownership to another member, or delete the workspace entirely. When a member is removed, their contributions remain with the workspace under the admin’s control.
How is data isolated between members within a Team workspace?
Team workspaces use a private-by-default model. Each member, including the workspace admin, has a personal workspace within the team where their assets, use cases, lineage, stakeholders, AI recommendations, and insights are private to them. The default view on every surface (Assets, Use Cases, Suggestions, Dashboard) is filtered to the assets the viewing member can see: their own, plus any asset another member has explicitly published to the team via the Share to Team action on the asset detail panel.
Only assets are shareable. Use cases, lineage, stakeholders, AI recommendations, and insights remain personal to their creator regardless of asset sharing. The person who shared an asset (or the team admin) can unshare it at any time; copies others have already made via the “Copy to mine” action are independent and unaffected.
Admin Workspace view: Team admins additionally have a Workspace tab on the Assets, Use Cases, and Suggestions surfaces that shows the full workspace catalog, including all members’ private content. This view is opt-in rather than the default, and is intended for onboarding, audit, and oversight tasks. Use of the Workspace view is logged.
No cross-tenant access: each customer organization is logically isolated at the database level using an org_id foreign key on all data tables. Cross-workspace data access is not possible through the application layer.
The Asset Health Dashboard respects the same private-by-default scope — every count and panel reflects the assets the viewing member can see, matching the default view on the Assets tab.
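The two scoping rules in this section (tenant isolation by org_id, and the private-by-default member view) can be illustrated with one query. This is a sketch under assumptions: only the org_id column is named in the text, while the table layout, the owner_id and shared columns, and the function name are hypothetical. It uses an in-memory SQLite database purely so the example is runnable; production runs on PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE assets (
           id INTEGER PRIMARY KEY,
           org_id INTEGER NOT NULL,    -- tenant key named in the text
           owner_id INTEGER NOT NULL,  -- hypothetical: member who owns the asset
           name TEXT NOT NULL,
           shared INTEGER DEFAULT 0    -- hypothetical: 1 after "Share to Team"
       )"""
)
conn.executemany(
    "INSERT INTO assets (org_id, owner_id, name, shared) VALUES (?, ?, ?, ?)",
    [
        (1, 10, "orders.sql", 0),
        (1, 11, "kpis.xlsx", 1),
        (1, 11, "draft.ipynb", 0),
        (2, 20, "other_org.csv", 1),
    ],
)


def visible_assets(conn: sqlite3.Connection, org_id: int, member_id: int) -> list[str]:
    """Default view: the member's own assets plus assets shared within the same org."""
    rows = conn.execute(
        "SELECT name FROM assets "
        "WHERE org_id = ? AND (owner_id = ? OR shared = 1) "
        "ORDER BY name",
        (org_id, member_id),
    )
    return [name for (name,) in rows]
```

Here member 10 in org 1 sees their own file and the one shared by member 11, but not member 11's unshared draft, and nothing from org 2 even though that asset is marked shared.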
Can users delete their own accounts and data?
Yes. Users can delete their own accounts and all associated data from the Account section at any time. Account deletion is immediate and permanent — data is removed without the 30-day retention period that applies to subscription cancellation. For Team plan admins, ownership must be transferred to another member before deletion. All deletion events are logged for audit purposes with hashed identifiers, not full email addresses.
Subscription cancellation is a separate action that retains data for 30 days, allowing the user to resubscribe and recover access. Self-service account deletion bypasses this retention period and removes data immediately.
Can users export and take their data with them?
Yes. MyDataWork supports comprehensive data portability through multiple export options:
Full workspace export (JSON): Users can export their entire workspace as a JSON file at any time from Setup → Data portability. The export includes all assets, lineage edges, stakeholders, use cases (with objectives, progress measurements, baseline/current/target values, priorities, effort estimates, target dates, notes URLs, communication logs), saved AI recommendations, action plans, and Workspace Agent suggestion history. The exported JSON can be imported into any MyDataWork workspace, supporting both backup and migration use cases.
Portfolio exports: Use cases, portfolios, and lineage diagrams can be exported as PowerPoint or PDF documents directly from the application.
Subscription cancellation: Upon subscription cancellation, data is retained for 30 days to allow for export and potential resubscription before permanent deletion.
Can we use MyDataWork as part of a backup or disaster recovery strategy?
The JSON workspace export is suitable for self-managed backups. Users can export their complete workspace at any time, store the JSON file in their organization’s backup system, and restore by importing the JSON into a workspace. The import is a merge operation — existing data is updated and new data is added, but nothing is deleted — so the same JSON can be re-imported safely.
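The reason a merge import is safe to repeat can be shown with a small sketch. It assumes that exported records carry stable identifiers and that an update overwrites fields present in the export while leaving other fields alone; neither detail is stated in the text, and the function name and record shape are hypothetical.

```python
def merge_import(
    workspace: dict[str, dict], exported: dict[str, dict]
) -> dict[str, dict]:
    """Upsert exported records by id: update existing, add new, delete nothing."""
    # Copy so the caller's workspace is not mutated.
    merged = {record_id: dict(record) for record_id, record in workspace.items()}
    for record_id, record in exported.items():
        # Fields in the export win; fields absent from the export are kept.
        merged[record_id] = {**merged.get(record_id, {}), **record}
    return merged


if __name__ == "__main__":
    current = {"a1": {"name": "old", "tag": "x"}, "a2": {"name": "keep"}}
    backup = {"a1": {"name": "new"}, "a3": {"name": "added"}}
    once = merge_import(current, backup)
    twice = merge_import(once, backup)
    print(once == twice)  # re-importing the same export changes nothing
```

Because the operation only updates and adds, applying the same export a second time produces the same workspace state as applying it once.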
MyDataWork itself maintains operational backups of the production database — automated daily snapshots with point-in-time recovery within the retention window. The database runs in a Multi-AZ configuration with a synchronous standby in a second Availability Zone for high availability. The restore-from-snapshot procedure is documented and periodically tested. Customer-managed JSON exports provide an additional layer of independent data portability that does not depend on MyDataWork’s availability.
Is MyDataWork SOC 2 certified?
Not yet. SOC 2 certification is on our roadmap. Our infrastructure runs on AWS, which maintains SOC 2, ISO 27001, PCI DSS, and other certifications for its own services. We follow defense-in-depth security practices including:
- Encryption at rest (AWS KMS) for the production database and per-customer integration credentials; encryption in transit via TLS 1.2+ with enforced HTTPS.
- Application secrets in AWS Secrets Manager, resolved at container start through a scoped IAM role — secrets are not stored in plaintext configuration.
- Network isolation: the production database is not publicly accessible from the internet, and the application tier runs in private subnets.
- High availability and backups: the database runs Multi-AZ with automated daily backups, point-in-time recovery within the retention window, and a restore procedure that is documented and periodically tested.
- Continuous monitoring with alerting on application errors, database health, and infrastructure saturation.
How are user credentials protected?
Passwords are never stored in plaintext. All passwords are hashed with bcrypt, an industry-standard password-hashing algorithm. Authentication uses short-lived JSON Web Tokens (JWTs). Users can reset passwords via email verification. We do not store payment card information; payments are processed by Stripe.
Does MyDataWork comply with GDPR or CCPA?
MyDataWork supports key rights established by GDPR and CCPA:
- Right to data portability: Users can export their complete workspace as a JSON file at any time, satisfying GDPR Article 20 requirements for structured, machine-readable export.
- Right to erasure: Users can delete their own accounts and all associated data via self-service deletion at any time. Deletion is immediate and permanent.
- Data minimization: MyDataWork stores only the data users explicitly enter and metadata users explicitly choose to index. We do not sell user data and do not use data for advertising.
We are happy to discuss data processing agreements for organizational customers.
What happens if we want to stop using MyDataWork?
There is no lock-in. Export your complete workspace as JSON at any time, or export portfolios and use cases as PowerPoint or PDF. Cancel your subscription from Account settings — data is retained for 30 days after cancellation. For immediate, complete removal, use the self-service account deletion option in your Account section. The Connector can be uninstalled like any standard Windows application.
Who do we contact with security questions?
Email contact@mydatawork.com. We aim to respond to security-related inquiries within 2 business days.